| Title: | A Comprehensive Collection of Neuroscience and Brain-Related Datasets |
|---|---|
| Description: | Offers a rich and diverse collection of datasets focused on the brain, nervous system, and related disorders. The package includes clinical, experimental, neuroimaging, behavioral, cognitive, and simulated data on conditions such as Parkinson's disease, Alzheimer's disease, dementia, epilepsy, schizophrenia, autism spectrum disorder, attention deficit hyperactivity disorder, Tourette's syndrome, traumatic brain injury, gliomas, migraines, headaches, sleep disorders, concussions, encephalitis, subarachnoid hemorrhage, and mental health conditions. Datasets cover structural and functional brain data, cross-sectional and longitudinal MRI imaging studies, neurotransmission, gene expression, cognitive performance, intelligence metrics, sleep deprivation effects, treatment outcomes, brain-body relationships across species, neurological injury patterns, and acupuncture interventions. Data sources include peer-reviewed studies, clinical trials, military health records, sports injury databases, and international comparative studies. Designed for researchers, neuroscientists, clinicians, psychologists, data scientists, and students, this package facilitates exploratory data analysis, statistical modeling, and hypothesis testing in neuroscience and neuroepidemiology. |
| Authors: | Renzo Caceres Rossi [aut, cre] (ORCID: <https://orcid.org/0009-0005-0744-854X>) |
| Maintainer: | Renzo Caceres Rossi <[email protected]> |
| License: | GPL-3 |
| Version: | 0.3.0 |
| Built: | 2026-06-01 10:13:28 UTC |
| Source: | https://github.com/lightbluetitan/neurodatasets |
This dataset, aba_phenotype_data_df, is a data frame containing brain tissue phenotype measurements from the Allen Brain Atlas Aging, Dementia, and TBI study. The data includes immunohistochemistry markers for microglia and astrocytes across 377 brain samples, intended for correlation analyses with expression data.
data(aba_phenotype_data_df)
A data frame with 377 observations and 4 variables:
Character: Brain structure acronym
Numeric: IBA1 immunohistochemistry measurement (microglia marker)
Numeric: GFAP immunohistochemistry measurement (astrocyte marker)
Character: Sample identification code
The dataset name has been kept as 'aba_phenotype_data_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the BRETIGEA package version 1.0.3. Original data from: Allen Brain Atlas Aging, Dementia, and TBI study.
This dataset, ability_intelligence_list, is a list containing psychometric data from six cognitive tests administered to 112 individuals. The list includes a covariance matrix, variable means, and observation count for tests measuring various intellectual abilities.
data(ability_intelligence_list)
A list with 3 components:
Numeric matrix [6×6]: Test score covariance matrix
Numeric vector [6]: Variable means
Numeric: Number of observations (112)
The dataset name has been kept as 'ability_intelligence_list' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'list' indicates that the dataset is a list object. The original content has not been modified.
Data taken from the educationR package version 0.10
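As a rough illustration of how a covariance matrix like the first component can be converted to correlations, the sketch below uses base R's cov2cor(). The toy list and its values are made up for illustration. Accessing components by position assumes they are stored in the order listed above, so check names(ability_intelligence_list) before relying on that.

```r
# Hypothetical stand-in with the same three-part layout (covariance matrix,
# means, observation count); the numbers are invented, not package data.
toy <- list(
  matrix(c(4, 2, 1,
           2, 9, 3,
           1, 3, 16), nrow = 3, byrow = TRUE),
  c(10, 20, 30),
  50
)

cov_mat <- toy[[1]]          # assumed position of the covariance matrix
cor_mat <- cov2cor(cov_mat)  # rescale so every diagonal entry becomes 1

print(round(cor_mat, 3))
```

With the package loaded, replacing toy with ability_intelligence_list should give the correlations between the six tests, provided the component order matches the listing.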
This dataset, acupuncture_df, is a data frame from a randomized controlled trial (RCT) evaluating the effectiveness of acupuncture therapy for chronic headaches. The primary outcome was the headache severity score, measured using a 6-item Likert-type scale at the one-year follow-up. The dataset includes group allocation, baseline headache score, one-year follow-up score, and the corresponding change score. Some observations may contain missing values due to omitted cases recorded in the dataset attributes.
data(acupuncture_df)
A data frame with 301 observations and 4 variables:
Group assignment (integer)
Baseline headache severity score (numeric)
Headache severity score at one-year follow-up (numeric)
Change in headache severity score (numeric)
The dataset name has been kept as acupuncture_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the R4HCR package version 0.1
This dataset, AD_biomarkers_tbl_df, is a tibble containing clinical data from 333 patients in a study of Alzheimer's disease biomarkers. The study included patients with mild cognitive impairment and healthy controls, with measurements of demographic characteristics, apolipoprotein E genotype, protein biomarkers (including Abeta, Tau, and pTau), and clinical dementia scores.
data(AD_biomarkers_tbl_df)
A tibble with 333 observations and 131 variables:
Numeric: Patient age
Numeric: Indicator for male gender (1 = male, 0 = female)
Factor: Apolipoprotein E genotype
Factor: Clinical classification (e.g., healthy, impaired)
Numeric: Amyloid-beta 42 protein measurement
Numeric: Tau protein measurement
Numeric: Phosphorylated Tau protein measurement
Numeric measurements of various proteins and biomarkers
The dataset name has been kept as 'AD_biomarkers_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified.
Data taken from the modeldata package version 1.4.0. Original study: Craig-Schapiro R, Kuhn M, Xiong C, et al. (2011) Multiplexed Immunoassay Panel Identifies Novel CSF Biomarkers for Alzheimer's Disease Diagnosis and Prognosis. PLoS ONE 6(4): e18850.
This dataset, ADHD_df, is a data frame containing ADHD symptom ratings for 355 children aged 6 to 8 years from the Children's Attention Project (CAP) cohort (Silk et al. 2019). The sample consists of 146 children diagnosed with ADHD and 209 without a diagnosis. Symptoms were assessed through structured interviews with parents using the NIMH Diagnostic Interview Schedule for Children IV (DISC-IV) (Shaffer et al. 2000). The checklist includes 18 items: 9 Inattentive (I) and 9 Hyperactive/Impulsive (HI). Each symptom item is binary coded (1 = present, 0 = absent), providing a comprehensive assessment of ADHD symptomatology in young children.
data(ADHD_df)
A data frame with 355 observations and 19 variables:
Group indicator (integer: 1 = ADHD diagnosis, 0 = no diagnosis)
Avoids tasks requiring sustained mental effort (integer: 0 or 1)
Fails to give close attention to details (integer: 0 or 1)
Easily distracted by extraneous stimuli (integer: 0 or 1)
Forgetful in daily activities (integer: 0 or 1)
Fails to follow through on instructions (integer: 0 or 1)
Does not seem to listen when spoken to directly (integer: 0 or 1)
Loses things necessary for tasks or activities (integer: 0 or 1)
Difficulty organizing tasks and activities (integer: 0 or 1)
Difficulty sustaining attention in tasks or play (integer: 0 or 1)
Blurts out answers before questions are completed (integer: 0 or 1)
Fidgets with hands or feet or squirms in seat (integer: 0 or 1)
Interrupts or intrudes on others (integer: 0 or 1)
Acts as if driven by a motor (integer: 0 or 1)
Difficulty playing or engaging quietly in leisure activities (integer: 0 or 1)
Runs about or climbs excessively in inappropriate situations (integer: 0 or 1)
Leaves seat in situations when remaining seated is expected (integer: 0 or 1)
Talks excessively (integer: 0 or 1)
Difficulty waiting turn (integer: 0 or 1)
The dataset name has been kept as ADHD_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the bgms package version 0.1.6.1
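Because each symptom is coded 0/1, subscale counts can be obtained by summing across items. The sketch below is only an illustration on made-up data. It assumes the columns are stored in the order listed above (group indicator first, then the 9 Inattentive items, then the 9 Hyperactive/Impulsive items), which should be verified with names(ADHD_df) before use.

```r
# Hypothetical toy data with the same 19-column layout; values are invented.
items <- matrix(0L, nrow = 3, ncol = 18)
items[1, 1:9] <- 1L            # child 1: all inattentive items present
items[2, c(1, 10, 11)] <- 1L   # child 2: one inattentive, two HI items
toy <- data.frame(group = c(1L, 0L, 0L), items)

# Assumed positions: columns 2-10 inattentive, columns 11-19 hyperactive/impulsive
inattentive_count <- unname(rowSums(toy[, 2:10]))
hyperactive_count <- unname(rowSums(toy[, 11:19]))

print(data.frame(group = toy$group, inattentive_count, hyperactive_count))
```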
This dataset, adolescent_mental_health_df, is a data frame containing mental health assessments from the National Longitudinal Study of Adolescent Health. The data includes depression and anxiety measures for 4,344 students in grades 7-12 from a cross-sectional sample analyzed by Warne (2014).
data(adolescent_mental_health_df)
A data frame with 4,344 observations and 3 variables:
Ordered factor with 6 levels: School grade (7-12)
Integer: Depression symptom score
Integer: Anxiety symptom score
The dataset name has been kept as 'adolescent_mental_health_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the heplots package version 1.7.4. Original analysis: Warne, R.T. (2014) A primer on Multivariate Analysis of Variance (MANOVA) for Behavioral Scientists. Practical Assessment, Research & Evaluation, 19(1).
This dataset, alzheimer_smoking_df, is a data frame containing case-control data from a study examining the association between smoking and Alzheimer's disease. The study included 538 participants with information on smoking status, disease classification, and gender.
data(alzheimer_smoking_df)
A data frame with 538 observations and 3 variables:
Factor: Smoking status of participants (4 levels)
Factor: Disease classification including Alzheimer's diagnosis (3 levels)
Factor: Participant's gender (2 levels)
The dataset name has been kept as 'alzheimer_smoking_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the coin package version 1.4-3. Original study: Salib, E. and Hillier, V. (1997). A case-control study of smoking and Alzheimer's disease. International Journal of Geriatric Psychiatry 12: 295-300.
This dataset, ASD_risks_tbl_df, is a tibble containing information on various clinical, behavioral, genetic, and developmental factors associated with the risk of developing Autism Spectrum Disorder (ASD) traits in children. The dataset consists of 1,985 observations and 28 variables, including the Autism Spectrum Quotient items (A1–A10), Social Responsiveness Scale, Q-CHAT-10 score, Childhood Autism Rating Scale, and multiple indicators related to speech, learning, genetics, mental health, developmental delays, behavioral issues, demographics, and family history. The final column indicates whether the child is expected to develop ASD traits in the future (0 or 1).
data(ASD_risks_tbl_df)
A tibble with 1,985 observations and 28 variables:
Patient case identifier (numeric)
Autism Spectrum Quotient item A1 (numeric)
Autism Spectrum Quotient item A2 (numeric)
Autism Spectrum Quotient item A3 (numeric)
Autism Spectrum Quotient item A4 (numeric)
Autism Spectrum Quotient item A5 (numeric)
Autism Spectrum Quotient item A6 (numeric)
Autism Spectrum Quotient item A7 (numeric)
Autism Spectrum Quotient item A8 (numeric)
Autism Spectrum Quotient item A9 (numeric)
Autism Spectrum Quotient item A10 (numeric)
Social Responsiveness Scale score (numeric)
Age in years (numeric)
Q-CHAT-10 score (numeric)
Indicator of speech delay or language disorder (character)
Indicator of learning disorder (character)
Presence of genetic disorders (character)
Presence of depression (character)
Indicator of global developmental delay or intellectual disability (character)
Presence of social or behavioral issues (character)
Childhood Autism Rating Scale score (numeric)
Presence of anxiety disorder (character)
Sex of the participant (character)
Ethnicity of the participant (character)
History of jaundice (character)
Indicator of family member with ASD (character)
Relationship of the respondent who completed the test (character)
Indicator of whether the child is expected to develop ASD traits (character)
The dataset name has been kept as ASD_risks_tbl_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix tbl_df indicates that the dataset is a tibble (a modern data frame). The original content has not been modified in any way. Variable names and values are provided exactly as they appear in the source.
Data taken from Kaggle: https://www.kaggle.com/datasets/uppulurimadhuri/dataset
This dataset, bilingual_brains_df, is a data frame containing measurements of second language proficiency scores and gray matter density in the left inferior parietal region from 22 observations.
data(bilingual_brains_df)
A data frame with 22 observations and 2 variables:
Numeric vector representing second language proficiency scores (summary of reading, writing, and speech)
Numeric vector representing density of gray matter in the left inferior parietal region
The dataset name has been kept as 'bilingual_brains_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the abd package version 0.2-8
This dataset, blood_brain_barrier_df, is a data frame containing experimental measurements from a rat study investigating sugar-infusion methods for temporary blood-brain barrier disruption. The barrier's protective function was assessed through multiple biological markers.
data(blood_brain_barrier_df)
A data frame with 34 observations and 9 variables:
Integer: Brain tissue measurement (units not specified)
Integer: Liver tissue measurement (units not specified)
Numeric: Experimental time measurement (hours)
Factor with 2 levels: Experimental treatment groups
Integer: Observation period (days)
Factor with 2 levels: Animal sex (Male/Female)
Integer: Subject weight (grams)
Numeric: Physiological loss measurement
Integer: Tumor presence indicator (0/1)
The dataset name has been kept as 'blood_brain_barrier_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the Sleuth3 package version 1.0-6. Original reference: Ramsey, F.L. and Schafer, D.W. (2013) The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed), Cengage Learning.
This dataset, brain_litter_mammals_df, is a data frame comparing relative brain weights between 96 mammalian species divided by reproductive strategy: 51 species with small litters and 45 species with large litters.
data(brain_litter_mammals_df)
A data frame with 96 observations and 2 variables:
Numeric: Relative brain weight measurement (encephalization quotient or similar metric)
Factor with 2 levels: Reproductive strategy ("Small" and "Large" litter sizes)
The dataset name has been kept as brain_litter_mammals_df to avoid confusion
with other datasets in the R ecosystem. This naming convention helps distinguish
this dataset as part of the NeuroDataSets package. The suffix df indicates
that the dataset is a data frame. The original content has not been modified.
Data taken from the Sleuth3 package version 1.0-6. Original reference: Ramsey, F.L. and Schafer, D.W. (2002) The Statistical Sleuth: A Course in Methods of Data Analysis (2nd ed), Duxbury.
This dataset, brain_size_iq_df, is a data frame containing neurocognitive measurements from a study examining relationships between brain size, gender, and intelligence. The data include 40 right-handed psychology students with no neurological history, selected based on extreme Scholastic Aptitude Test scores.
data(brain_size_iq_df)
A data frame with 40 observations and 7 variables:
Numeric: Participant identification number
Factor with 2 levels: Participant's gender (Male/Female)
Numeric: Full Scale IQ score
Numeric: Verbal IQ score
Numeric: Performance IQ score
Numeric: Brain size measurement from MRI (in cubic cm)
Factor with 2 levels: IQ group classification (High/Low)
The dataset name has been kept as 'brain_size_iq_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the sur package version 1.0.4. Original study: Willerman, L., Schultz, R., Rutledge, J.N. and Bigler, E.D. (1991) In Vivo Brain Size and Intelligence. Intelligence, 15, 223-228.
This dataset, brain_string_players_df, is a data frame containing neurophysiological measurements from a study of 15 violin and other string instrument players. The data examines the relationship between years of musical practice and measured brain activity levels in relevant cortical regions.
data(brain_string_players_df)
A data frame with 15 observations and 2 variables:
Integer: Years of musical practice
Numeric: Brain activity measurement (likely fMRI or similar neuroimaging units)
The dataset name has been kept as 'brain_string_players_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the Sleuth3 package version 1.0-6. Original reference: Ramsey, F.L. and Schafer, D.W. (2013) The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed), Cengage Learning.
This dataset, brainexpression_df, is a data frame containing expression levels of the proteolipid protein 1 gene (PLP1) in 45 individuals across three groups. The dataset includes group classifications and corresponding PLP1 expression measurements, making it useful for comparative gene expression analysis and studying differences in myelin-related protein expression across populations.
data(brainexpression_df)
A data frame with 45 observations and 2 variables:
Group classification (factor with 3 levels)
Expression level of the proteolipid protein 1 gene (numeric)
The dataset name has been kept as brainexpression_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the abd package version 0.2-8
This dataset, brains_cognitive_matrix, is a matrix containing the states and covariates
of 649 participants enrolled in the BRAiNS cohort at the University of Kentucky's
Alzheimer's Disease Research Center. The data includes longitudinal cognitive assessments
and various health covariates across multiple visits.
data(brains_cognitive_matrix)
A matrix with 6,240 observations and 13 variables:
Integer: Participant identification number
Integer: Visit number
Integer: Previous cognitive state
Integer: Current cognitive state
Integer: Baseline age centered
Integer: Family history of dementia (0 = No, 1 = Yes)
Integer: History of high blood pressure (0 = No, 1 = Yes)
Integer: APOE allele carrier status (0 = Non-carrier, 1 = Carrier)
Integer: Smoking status indicator 1
Integer: Smoking status indicator 2
Integer: Smoking status indicator 3
Integer: Low education indicator (0 = No, 1 = Yes)
Integer: History of head injury (0 = No, 1 = Yes)
The dataset name has been kept as brains_cognitive_matrix to avoid confusion
with other datasets in the R ecosystem. This naming convention helps distinguish
this dataset as part of the NeuroDataSets package. The suffix matrix indicates
that the dataset is a matrix. The original content has not been modified.
Data taken from the RRMLRfMC package version 0.4.0. Original study: University of Kentucky's Alzheimer's Disease Research Center BRAiNS cohort.
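Since each row pairs a previous cognitive state with a current one, a transition table is a natural first summary. The sketch below is a minimal illustration on an invented matrix. It assumes the previous and current states sit in columns 3 and 4, following the order listed above, and the column names used here are hypothetical; check colnames(brains_cognitive_matrix) first.

```r
# Hypothetical toy matrix: id, visit, previous state, current state (invented values)
toy <- cbind(
  id    = c(1, 1, 1, 2, 2),
  visit = c(2, 3, 4, 2, 3),
  prev  = c(1, 1, 2, 1, 2),
  curr  = c(1, 2, 2, 2, 2)
)

# Count observed transitions, then convert each row to proportions
counts <- table(previous = toy[, 3], current = toy[, 4])
probs  <- prop.table(counts, margin = 1)

print(counts)
print(round(probs, 3))
```

These are raw empirical proportions that ignore the covariates and the repeated visits per participant; they are a descriptive starting point, not a fitted multi-state model.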
This dataset, brainvolume_df, is a data frame containing 83 empirical studies included in the meta-analysis by Pietschnig, Penke, Wicherts, Zeiler, and Voracek (2015), which examined the association between human brain volume and intelligence as measured by full-scale IQ. The dataset includes study identifiers, publication year, correlation coefficients, Fisher’s z-transformed values, standard errors, sample sizes, sex composition, and mean participant age. These data provide a comprehensive resource for investigating population-level relationships between brain volume and cognitive ability.
data(brainvolume_df)
A data frame with 83 observations and 8 variables:
Study identifier (character)
Year of publication (integer)
Correlation coefficient between brain volume and intelligence (numeric)
Fisher’s z-transformed correlation (numeric)
Standard error of the Fisher’s z value (numeric)
Sample size (integer)
Sex composition of the sample (factor with 4 levels)
Mean age of participants (numeric)
The dataset name has been kept as brainvolume_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the metaviz package version 0.3.1
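For readers unfamiliar with the Fisher z-transformation, the conventional definitions are z = atanh(r) with standard error 1/sqrt(n - 3). The sketch below applies them to made-up values. The helper name fisher_z is hypothetical, and whether the columns in this dataset were computed in exactly this way is an assumption rather than something the source confirms.

```r
# Hypothetical helper: conventional Fisher z and its standard error
fisher_z <- function(r, n) {
  stopifnot(all(abs(r) < 1), all(n > 3))
  data.frame(z = atanh(r), se = 1 / sqrt(n - 3))
}

# Invented correlations and sample sizes, not taken from the dataset
toy_r <- c(0.50, 0.24)
toy_n <- c(28, 103)

res  <- fisher_z(toy_r, toy_n)
back <- tanh(res$z)   # the inverse transform recovers the correlations

print(res)
```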
This dataset, cerebellar_age_df, is a data frame containing repeated measurements of age and adjusted volume of cerebellar hemispheres from 72 participants. Each participant was measured on two occasions (Time), resulting in a total of 144 observations. The measurements were captured from Figure 8, Cerebellar Hemispheres (lower right) of Raz et al. (2005). The dataset includes participant identifiers, measurement time, age, and cerebellar hemisphere volume. Some observations may contain missing values.
data(cerebellar_age_df)
A data frame with 144 observations and 4 variables:
Participant ID (integer)
Measurement occasion (integer)
Age of the participant (numeric)
Adjusted cerebellar hemisphere volume (numeric)
The dataset name has been kept as cerebellar_age_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the rmcorr package version 0.7.0
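With two measurement occasions per participant, one simple summary is the within-participant change in volume. The sketch below shows one way to compute it with base R's reshape() on invented data. The column names (id, time, age, volume) are hypothetical placeholders, so substitute the names reported by names(cerebellar_age_df).

```r
# Hypothetical toy data in the same long layout; values are invented,
# including one missing volume to show how NA propagates.
toy <- data.frame(
  id     = c(1, 1, 2, 2, 3, 3),
  time   = c(1, 2, 1, 2, 1, 2),
  age    = c(30, 35, 52, 57, 64, 69),
  volume = c(140, 138, 150, NA, 132, 129)
)

# One row per participant, with columns such as volume.1 and volume.2
wide <- reshape(toy, direction = "wide", idvar = "id", timevar = "time")
wide$volume_change <- wide$volume.2 - wide$volume.1

print(wide[, c("id", "volume.1", "volume.2", "volume_change")])
```

Participants with a missing measurement get NA for the change score rather than being silently dropped.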
This dataset, chimpbrains_df, is a data frame containing measurements of asymmetry in Brodmann's area 44 for 20 chimpanzees. Brodmann's area 44 is a brain region associated with language processing in humans and is located in the inferior frontal gyrus. The dataset includes individual identifiers, sex, and asymmetry measurements, providing insights into neural lateralization patterns in non-human primates. This data can be useful for comparative neuroanatomy studies and understanding the evolution of language-related brain structures.
data(chimpbrains_df)
A data frame with 20 observations and 3 variables:
Individual chimpanzee identifier (factor with 20 levels)
Sex of the chimpanzee (factor with 2 levels: "F" = female, "M" = male)
Asymmetry measurement of Brodmann's area 44 (numeric)
The dataset name has been kept as chimpbrains_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the abd package version 0.2-8
This dataset, cocaine_dopamine_df, is a data frame containing measurements of dopamine receptor blockade and perceived high levels from 34 human subjects as determined by PET scans.
data(cocaine_dopamine_df)
A data frame with 34 observations and 2 variables:
Integer vector representing percent of dopamine receptors blocked
Integer vector representing perceived level of high from PET scans
The dataset name has been kept as 'cocaine_dopamine_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the abd package version 0.2-8
This dataset, 'DA_schizophrenia_tbl_df', is a tibble containing measurements
of dopamine β-hydroxylase (DBH) activity in 25 schizophrenic patients treated
with antipsychotic medication. The data compares DBH levels between patient groups.
data(DA_schizophrenia_tbl_df)
A tibble with 25 observations and 2 variables:
Integer: Dopamine β-hydroxylase activity level (nmol/(mL·hr))
Character: Treatment/patient group classification
The dataset name has been changed to DA_schizophrenia_tbl_df to provide a shorter,
neuroscience-standard abbreviation where "DA" refers to dopamine. This naming convention
maintains clarity and consistency within the NeuroDataSets package. The suffix
tbl_df indicates that the dataset is a tibble. The original content has not been modified.
Data taken from the BSDA package version 1.2.2
This dataset, dementia_df, is a data frame containing information related to dementia assessment. The data includes dementia scores along with demographic variables such as age and sex, as well as study identifiers. The dataset consists of 1,000 observations across 4 variables and was originally sourced from the PBImisc package. This dataset can be useful for analyzing patterns in dementia scores across different demographic groups and studies.
data(dementia_df)
A data frame with 1,000 observations and 4 variables:
Dementia score (integer)
Age group of the participant (factor with 2 levels)
Sex of the participant (factor with 2 levels)
Study identifier (factor with 10 levels)
The dataset name has been kept as dementia_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the PBImisc package version 1.0
This dataset, encephalitis_df, is a data frame containing reported cases of herpes encephalitis in children from the regions of Bavaria and Lower Saxony. The data were collected between 1980 and 1993 as part of a study investigating the occurrence of herpes encephalitis in pediatric populations. The dataset includes the year of observation, regional identifiers, and the corresponding case counts, providing valuable information for epidemiological and public health research.
data(encephalitis_df)
A data frame with 26 observations and 3 variables:
Year of recorded cases (integer)
Regional identifier (integer)
Number of reported herpes encephalitis cases (integer)
The dataset name has been kept as encephalitis_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the catdata package version 1.2.4
This dataset, epilepsy_drug_qol_df, is a data frame containing quality of life measurements from the SANAD randomized controlled trial comparing carbamazepine and lamotrigine in 544 epilepsy patients. QoL assessments were collected at baseline, 3 months, 1 year and 2 years using validated instruments.
data(epilepsy_drug_qol_df)
A data frame with 1,852 observations and 9 variables:
Integer: Patient identification number
Numeric: Time to withdrawal/discontinuation (days)
Factor with 2 levels: Treatment group (carbamazepine/lamotrigine)
Integer: Withdrawal status indicator
Numeric: Assessment time point (days since baseline)
Numeric: Anxiety score (from QoL measure)
Numeric: Depression score (from QoL measure)
Numeric: Adverse effects profile score
Numeric: Alternative withdrawal coding
The dataset name has been kept as 'epilepsy_drug_qol_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the joineRML package version 0.4.7. Original study: Marson, A.G., et al. (2007) The SANAD study of effectiveness of carbamazepine, gabapentin, lamotrigine, oxcarbazepine, or topiramate for treatment of partial epilepsy: an unblinded randomised controlled trial. The Lancet, 369(9566), 1000-1015.
This dataset, epilepsy_drug_trial_df, is a data frame containing seizure counts from a clinical trial of anti-epileptic medication. The data includes seizure frequency measurements along with treatment indicators and patient covariates for 295 observations.
data(epilepsy_drug_trial_df)
A data frame with 295 observations and 6 variables:
Numeric: Count of epileptic seizures
Integer: Patient identification number
Numeric: Treatment indicator
Numeric: Exposure period indicator
Numeric: Adjusted time period
Numeric: Patient age in years
The dataset name has been kept as 'epilepsy_drug_trial_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the faraway package version 1.0.9
This dataset, epilepsy_RCT_tbl_df, is a tibble containing data from a randomized controlled trial of progabide for epilepsy treatment. The trial recorded seizure counts for 59 patients at baseline and four follow-up visits.
data(epilepsy_RCT_tbl_df)
A tibble with 59 observations and 8 variables:
Integer: Patient identification number
Factor with 2 levels: Treatment group (progabide/control)
Integer: Baseline seizure count
Integer: Patient age in years
Integer: Seizure count at first follow-up
Integer: Seizure count at second follow-up
Integer: Seizure count at third follow-up
Integer: Seizure count at fourth follow-up
The dataset name has been kept as 'epilepsy_RCT_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified.
Data taken from the pubh package version 2.0.0
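The wide layout of epilepsy_RCT_tbl_df described above (one row per patient, one column per follow-up visit) is often reshaped to one row per patient-visit before fitting count models. The following is a minimal sketch using base R's reshape(). The column names (id, treat, base, age, y1 to y4) and every value are assumptions made up for illustration, so check names(epilepsy_RCT_tbl_df) before adapting it.

```r
# Stand-in with the layout described above: one row per patient, four
# follow-up counts in separate columns. Names and values are assumptions,
# not taken from epilepsy_RCT_tbl_df.
toy_wide <- data.frame(
  id    = 1:3,
  treat = factor(c("control", "progabide", "progabide")),
  base  = c(11L, 8L, 20L),
  age   = c(31L, 25L, 40L),
  y1 = c(5L, 3L, 7L), y2 = c(3L, 2L, 9L),
  y3 = c(3L, 0L, 6L), y4 = c(3L, 1L, 5L)
)

# Reshape to one row per patient-visit with base R's reshape().
toy_visits <- reshape(
  toy_wide,
  direction = "long",
  varying   = c("y1", "y2", "y3", "y4"),
  v.names   = "seizures",
  timevar   = "visit",
  times     = 1:4,
  idvar     = "id"
)

# Sort by patient and visit, and drop the composite row names reshape() adds.
toy_visits <- toy_visits[order(toy_visits$id, toy_visits$visit), ]
rownames(toy_visits) <- NULL

print(head(toy_visits, 4))
```

The long form keeps the patient-level covariates on every row, which is the shape most repeated-measures modelling functions expect.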
This dataset, gm_expected_patterns_tbl_df, is a tibble containing expected patterns of gray matter in schizophrenia derived from large-scale meta-analyses by the ENIGMA consortium. It includes data from multiple neurological and psychiatric conditions for comparison.
data(gm_expected_patterns_tbl_df)
A tibble with 33 observations and 16 variables:
Character vector indicating gray matter regions
Numeric vector of expected patterns for schizophrenia spectrum disorder
Numeric vector of expected patterns for major depressive disorder
Numeric vector of expected patterns for Alzheimer's disease (ADNI cohort)
Numeric vector of expected patterns for Alzheimer's disease (ADNI+OSYRIX cohort)
Numeric vector of expected patterns for bipolar disorder
Numeric vector of expected patterns for Parkinson's disease
Numeric vector of expected patterns for diabetes
Numeric vector of expected patterns for high blood pressure
Numeric vector of expected patterns for high lipids
Numeric vector of expected patterns for metabolic syndrome
Numeric vector of expected patterns for 22q11.2 deletion syndrome
Numeric vector of expected patterns for suicide
Numeric vector of expected patterns for pediatric OCD
Numeric vector of expected patterns for adult OCD
Numeric vector of expected patterns for anorexia nervosa
The dataset name has been kept as 'gm_expected_patterns_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Data taken from the RVIpkg package version 0.3.2.
This dataset, guineapig_neuro_df, is a data frame containing measurements of spontaneous current amplitudes recorded from individual brain cells in adult guinea pigs. The study investigated whether synaptic transmission occurs in quantal units, which would manifest as multimodal amplitude distributions with regularly spaced peaks.
data(guineapig_neuro_df)
A data frame with 346 observations and 1 variable:
Numeric: Peak amplitude of spontaneous synaptic currents (pA or similar units)
The dataset name has been updated to 'guineapig_neuro_df' for clarity and brevity while preserving consistency with other datasets in the NeuroDataSets package. The suffix 'df' indicates that the dataset is a standard data frame.
Data taken from the boot package version 1.3-31. Original study: Paulsen, O. and Heggelund, P. (1994) The quantal size at retinogeniculate synapses determined from spontaneous and evoked EPSCs in guinea-pig thalamic slices. Journal of Physiology, 480, 505–511.
This dataset, hippocampus_lesions_df, is a data frame containing measurements of spatial memory scores and percent lesion of the hippocampus from 57 observations.
data(hippocampus_lesions_df)
A data frame with 57 observations and 2 variables:
Numeric vector representing percent lesion of the hippocampus
Numeric vector representing spatial memory scores
The dataset name has been kept as 'hippocampus_lesions_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the abd package version 0.2-8
This dataset, iq_country_tbl_df, is a tibble containing information on the average intelligence quotient (IQ) of countries around the world. In addition to average IQ scores, the dataset includes several socioeconomic and demographic indicators such as literacy rate, number of Nobel Prizes won collectively by each country, Human Development Index (HDI, 2021), mean years of schooling (2021), gross national income (GNI, 2021), and population estimates for 2023. These variables provide a broad context for understanding cognitive performance at the country level.
data(iq_country_tbl_df)
A tibble with 193 observations and 10 variables:
Global ranking based on average IQ (numeric)
Name of the country (character)
Estimated average IQ score of the population (numeric)
Continent to which the country belongs (character)
Literacy rate of the population (numeric)
Total number of Nobel Prizes won collectively by the country (numeric)
Human Development Index for the year 2021 (numeric)
Average years of schooling in 2021 (numeric)
Gross national income for 2021 (numeric)
Estimated population in 2023 (character)
The dataset name has been kept as iq_country_tbl_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix tbl_df indicates that the dataset is a tibble (a modern data frame). The original content has not been modified in any way. Variable names and values are provided exactly as they appear in the source.
Data taken from Kaggle: https://www.kaggle.com/datasets/mlippo/average-global-iq-per-country-with-other-stats
This dataset, mammals_brain_body_df, is a data frame containing comparative neuroanatomical and life history data for 96 mammalian species. The data examine the relationship between brain size, body size, and reproductive characteristics across different mammal species.
data(mammals_brain_body_df)
A data frame with 96 observations and 5 variables:
Factor with 96 levels: Mammalian species names
Numeric: Brain weight (grams)
Numeric: Body weight (kilograms)
Integer: Gestation period (days)
Numeric: Average litter size
The dataset name has been kept as 'mammals_brain_body_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the Sleuth3 package version 1.0-6. Original study: Allison, T. and Cicchetti, D.V. (1976) Sleep in Mammals: Ecological and Constitutional Correlates. Science, 194, 732-734.
This dataset, markers_brain_df, is a data frame containing the top 1,000 marker genes for each of six major brain cell types (astrocytes, endothelial cells, microglia, neurons, oligodendrocytes, and OPCs) identified through meta-analysis of both human and mouse brain gene expression data.
data(markers_brain_df)
A data frame with 6,000 observations and 2 variables:
Character: Gene symbol for cell-type specific marker (human/mouse orthologs)
Character: Cell type classification (astrocytes/endothelial/microglia/neurons/oligodendrocytes/OPCs)
The dataset name has been kept as 'markers_brain_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the BRETIGEA package version 1.0.3. Derived from: Meta-analysis of human and mouse brain cell-type specific gene expression datasets.
This dataset, markers_human_brain_df, is a data frame containing up to 1,000 top-ranked marker genes for each of six major brain cell types (astrocytes, endothelial cells, microglia, neurons, oligodendrocytes, and OPCs) identified through meta-analysis of human brain gene expression data.
data(markers_human_brain_df)
A data frame with 5,500 observations and 2 variables:
Character: Gene symbol for cell-type specific marker
Character: Cell type classification (astrocytes/endothelial/microglia/neurons/oligodendrocytes/OPCs)
The dataset name has been kept as 'markers_human_brain_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the BRETIGEA package version 1.0.3.
This dataset, markers_mouse_brain_df, is a data frame containing up to 1,000 top-ranked marker genes for each of six major brain cell types (astrocytes, endothelial cells, microglia, neurons, oligodendrocytes, and OPCs) identified through meta-analysis of mouse brain gene expression data.
data(markers_mouse_brain_df)
A data frame with 5,430 observations and 2 variables:
Character: Gene symbol for cell-type specific marker
Character: Cell type classification (astrocytes/endothelial/microglia/neurons/oligodendrocytes/OPCs)
The dataset name has been kept as 'markers_mouse_brain_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the BRETIGEA package version 1.0.3. Original study: Mckenzie AT, Wang M, Hauberg ME, et al. (2018) Brain Cell Type Specific Gene Expression and Co-expression Network Architectures. Scientific Reports, 8(1), 8868.
This dataset, migraine_treatment_df, is a data frame containing clinical data on 4,152 migraine treatment cases collected by Tammy Kostecki-Dillon. The data includes treatment details, headache characteristics, and patient demographics.
data(migraine_treatment_df)
A data frame with 4,152 observations and 9 variables:
Integer: Patient identification number
Integer: Time in days relative to the onset of treatment
Integer: Time in days from the start of the study
Factor with 3 levels: Headache type classification
Integer: Patient age in years
Numeric: Air quality index measurement
Factor with 3 levels: Medication type
Factor with 2 levels: Headache presence/severity
Factor with 2 levels: Patient sex
The dataset name has been kept as 'migraine_treatment_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified.
Data taken from the carData package version 3.0-5. Original collection: Kostecki-Dillon, T. (Year not specified) Migraine Treatment Study.
This dataset, migraines_df, is a data frame containing data on the effects of transcranial magnetic stimulation (TMS) on migraine headaches. The dataset includes two groups along with counts of participants who reported improvement (“Yes”), no improvement (“No”), and the total number of trials. These data are useful for evaluating the potential therapeutic impact of TMS on migraine symptoms.
data(migraines_df)
A data frame with 2 observations and 4 variables:
Group indicator (factor with 2 levels)
Number of participants reporting improvement (integer)
Number of participants reporting no improvement (integer)
Total number of trials (integer)
The dataset name has been kept as migraines_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the Stat2Data package version 2.0.0
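Because migraines_df stores one row per group with counts of improved and not-improved participants, the two groups can be compared directly from those counts. The sketch below is one possible approach, not an analysis taken from the source. It uses prop.test() from base R, and the group labels, column names (Group, Yes, No, Trials), and counts are assumptions invented for illustration.

```r
# Illustrative 2 x 2 layout; group labels, column names, and counts are
# assumptions for this sketch, not the values stored in migraines_df.
toy_tms <- data.frame(
  Group  = factor(c("TMS", "Placebo")),
  Yes    = c(30L, 20L),
  No     = c(70L, 80L),
  Trials = c(100L, 100L)
)

# Sanity check: improved + not improved should equal the total per group.
stopifnot(all(toy_tms$Yes + toy_tms$No == toy_tms$Trials))

# Proportion improved per group.
prop_improved <- setNames(toy_tms$Yes / toy_tms$Trials,
                          as.character(toy_tms$Group))

# Two-sample test of equal proportions (with continuity correction by default).
test_result <- prop.test(x = toy_tms$Yes, n = toy_tms$Trials)

print(prop_improved)
print(test_result$p.value)
```

With small cell counts, fisher.test() on the matrix of Yes/No counts may be a more appropriate choice than prop.test().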
This dataset, migrane_dose_df, is a data frame obtained from a randomized, placebo-controlled dose–response clinical trial for the treatment of acute migraine (clinicaltrials.gov identifier NCT00712725). The primary endpoint was “pain freedom at 2 hours postdose,” measured as a binary outcome. The dataset includes dose levels, the number of participants achieving pain freedom, and the total number of treated participants at each dose level. These data are useful for dose–response modeling and clinical trial analysis in migraine research.
data(migrane_dose_df)
A data frame with 8 observations and 3 variables:
Dose level administered (numeric)
Number of participants who achieved pain freedom at 2 hours postdose (integer)
Total number of treated participants at the corresponding dose level (integer)
The dataset name has been kept as migrane_dose_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the DoseFinding package version 1.4-1
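As a minimal sketch of how the three columns of migrane_dose_df relate, the observed pain-free proportion at each dose is the number of responders divided by the number treated. The column names (dose, painfree, ntrt) and all counts below are assumptions made up for illustration, so check names(migrane_dose_df) before adapting the code.

```r
# Illustrative stand-in with the three-column layout described above.
# Column names and counts are assumptions, not values from migrane_dose_df.
toy_dose <- data.frame(
  dose     = c(0, 10, 50),
  painfree = c(10L, 12L, 20L),
  ntrt     = c(100L, 60L, 50L)
)

# Observed response proportion per dose level, with an exact binomial
# 95% confidence interval from stats::binom.test().
response_rates <- function(df) {
  ci <- t(mapply(
    function(x, n) as.numeric(binom.test(x, n)$conf.int),
    df$painfree, df$ntrt
  ))
  data.frame(
    dose  = df$dose,
    rate  = df$painfree / df$ntrt,
    lower = ci[, 1],
    upper = ci[, 2]
  )
}

rates <- response_rates(toy_dose)
print(rates)
```

These per-dose proportions are only a descriptive first step. Fitting a dose-response curve is a separate modelling task.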
This dataset, neanderthal_brains_df, is a data frame containing log-transformed measurements of brain size (lnbrain) and body mass (lnmass) from 39 specimens of Neanderthals and early modern humans, identified by species.
data(neanderthal_brains_df)
A data frame with 39 observations and 3 variables:
Numeric vector representing natural logarithm of body mass
Numeric vector representing natural logarithm of brain size
Factor indicating species with 2 levels (Neanderthals and early modern humans)
The dataset name has been kept as 'neanderthal_brains_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the abd package version 0.2-8
This dataset, neuro_pointprocess_matrix, is a matrix containing times of observed neuronal firing in windows of 250 ms surrounding stimulus application in human subjects. Each row represents an experimental replication (469 total replicates), with values indicating spike times relative to stimulus onset.
data(neuro_pointprocess_matrix)
A numeric matrix with 469 observations (rows) and 6 variables (columns):
Numeric: Spike times (milliseconds) relative to stimulus onset, with NA representing no spike in that trial window
The dataset name has been kept as 'neuro_pointprocess_matrix' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'matrix' indicates that the dataset is a matrix. The original content has not been modified.
Data taken from the boot package version 1.3-31. Original collection: Dr. S.J. Boniface, Neurophysiology Unit, Radcliffe Infirmary, Oxford.
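Since each row of neuro_pointprocess_matrix is one replicate and NA marks positions with no spike, spike counts per replicate and a pooled vector of spike times can be obtained without any reshaping. The sketch below uses a tiny invented matrix with the same conventions. Its dimensions and values are assumptions for illustration only.

```r
# Tiny stand-in matrix: 4 trials (rows), up to 3 spikes per trial.
# Values are invented for illustration; NA pads trials with fewer spikes.
toy_spikes <- rbind(
  c(-120.5,  15.0, 98.2),
  c(  42.1,    NA,   NA),
  c(    NA,    NA,   NA),
  c( -10.0, 200.3,   NA)
)

# Number of recorded spikes in each trial.
spikes_per_trial <- rowSums(!is.na(toy_spikes))

# Pool all spike times, dropping the NA padding, e.g. for a histogram
# of firing times relative to stimulus onset.
pooled_times <- toy_spikes[!is.na(toy_spikes)]

print(spikes_per_trial)
print(range(pooled_times))
```

Applied to the real matrix, the same two lines would work on neuro_pointprocess_matrix in place of toy_spikes.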
This package provides a diverse collection of datasets focused on the brain, nervous system, and related disorders. The package includes clinical, experimental, neuroimaging, behavioral, and cognitive data on conditions including Parkinson's, Alzheimer's, epilepsy, schizophrenia, autism, ADHD, Tourette's, TBI, brain tumors, migraines, sleep disorders, and mental health.
NeuroDataSets: A Comprehensive Collection of Neuroscience and Brain-Related Datasets
Maintainer: Renzo Caceres Rossi [email protected]
This dataset, neurodeg_dose_df, is a data frame containing simulated longitudinal data from a Phase 2 clinical study of a potential treatment for a neurodegenerative disease. The disease state is assessed using a functional scale, where smaller values indicate more severe neurodeterioration. The primary goal of the drug is to slow disease progression, which is quantified through the linear slope of the functional scale over time. The dataset includes repeated measurements for multiple individuals across varying dose levels, allowing investigation of dose–response relationships in disease progression.
data(neurodeg_dose_df)
A data frame with 1250 observations and 4 variables:
Measured value of the functional scale (numeric)
Participant identifier (integer)
Dose level administered (numeric)
Measurement time point (numeric)
The dataset name has been kept as neurodeg_dose_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the DoseFinding package version 1.4-1
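The description of neurodeg_dose_df quantifies disease progression by the linear slope of the functional scale over time. A simple way to illustrate this is to fit a separate least-squares line for each participant and compare the slopes across dose levels. This is a deliberately simple sketch, not the analysis used in the source package, where a mixed-effects model would be the more usual choice. The column names (resp, id, dose, time) and all values are assumptions invented for illustration.

```r
# Simulated stand-in: 3 participants, 4 visits each. Column names
# (resp, id, dose, time) and all values are assumptions for this sketch.
toy_long <- data.frame(
  id   = rep(1:3, each = 4),
  dose = rep(c(0, 1, 4), each = 4),
  time = rep(c(0, 3, 6, 9), times = 3)
)

# Exact linear decline so the recovered slopes are easy to check by eye.
true_slope <- c(-2, -1.5, -1)
toy_long$resp <- 200 + true_slope[toy_long$id] * toy_long$time

# Least-squares slope of the functional scale over time, per participant.
slope_by_id <- function(df) {
  pieces <- split(df, df$id)
  data.frame(
    id    = as.integer(names(pieces)),
    dose  = unname(vapply(pieces, function(p) p$dose[1], numeric(1))),
    slope = unname(vapply(
      pieces,
      function(p) coef(lm(resp ~ time, data = p))[["time"]],
      numeric(1)
    ))
  )
}

slopes <- slope_by_id(toy_long)
print(slopes)
```

A less negative slope at higher doses would be consistent with slower progression, since smaller scale values indicate more severe deterioration.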
This dataset, nfl_concussions_tbl_df, is a tibble containing detailed information on concussion injuries that occurred in the National Football League (NFL) from 2012 to 2014. The dataset includes 392 recorded concussion cases, capturing information such as player identity, team, game, date of injury, position, whether the injury occurred during pre-season, and multiple injury-related details including weeks injured, games missed, and reported injury type.
data(nfl_concussions_tbl_df)
A tibble with 392 observations and 18 variables:
Unique identifier for each concussion record (character)
Name of the player who sustained the concussion (character)
Team of the injured player (character)
Game in which the injury occurred (character)
Date of the concussion incident (character)
Opponent team during the game (character)
Player's position (character)
Indicates if the injury occurred during pre-season (character)
Indicates if the player’s team won the game (character)
Week number of the season when the injury occurred (numeric)
NFL season year associated with the injury (character)
Number of weeks the player was injured (numeric)
Number of games missed due to the concussion (numeric)
Indicates if the injury type was unknown (character)
Reported type of concussion injury (character)
Total snaps played before injury (numeric)
Playtime after injury occurred (character)
Average playtime before injury (character)
The dataset name has been kept as nfl_concussions_tbl_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix tbl_df indicates that the dataset is a tibble (a modern data frame). The original content has not been modified in any way. Variable names and values are provided exactly as they appear in the source.
Data taken from Kaggle: https://www.kaggle.com/datasets/rishidamarla/concussions-in-the-nfl-20122014
This dataset, OASIS_cross_tbl_df, is a tibble containing a cross-sectional collection of MRI data from 436 individuals aged 18 to 96, obtained as part of the Open Access Series of Imaging Studies (OASIS). For each subject, 3 or 4 T1-weighted MRI scans acquired during a single scanning session are included. All participants are right-handed and include both men and women. Among the subjects over the age of 60, 100 have been clinically diagnosed with very mild to moderate Alzheimer’s disease (AD).
data(OASIS_cross_tbl_df)
A tibble with 436 observations and 12 variables:
Subject identifier (character)
Sex of the participant (character)
Handedness of the participant (character)
Age in years (numeric)
Years of education (numeric)
Socioeconomic status score (numeric)
Mini-Mental State Examination score (numeric)
Clinical Dementia Rating score (numeric)
Estimated total intracranial volume (numeric)
Normalized whole-brain volume (numeric)
Atlas scaling factor (numeric)
Inter-scan interval in days (character)
The dataset name has been kept as OASIS_cross_tbl_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix tbl_df indicates that the dataset is a tibble (a modern data frame). The original content has not been modified in any way. Variable names and values are provided exactly as they appear in the source.
Data taken from Kaggle: https://www.kaggle.com/datasets/jboysen/mri-and-alzheimers
This dataset, OASIS_long_tbl_df, is a tibble containing a longitudinal collection of MRI data from 150 subjects aged 60 to 96, obtained as part of the Open Access Series of Imaging Studies (OASIS). Each participant completed two or more MRI sessions, with visits spaced at least one year apart, resulting in a total of 373 imaging sessions. The dataset includes both nondemented and demented older adults and provides comprehensive demographic, clinical, and neuroimaging measures for each visit.
data(OASIS_long_tbl_df)
A tibble with 373 observations and 15 variables:
Unique identifier for each subject (character)
Identifier for each MRI session (character)
Clinical group classification (character)
Visit number for longitudinal assessment (numeric)
Time in days between MRI sessions (numeric)
Sex of the participant (character)
Handedness of the participant (character)
Age in years at the time of the visit (numeric)
Years of education (numeric)
Socioeconomic status score (numeric)
Mini-Mental State Examination score (numeric)
Clinical Dementia Rating score (numeric)
Estimated total intracranial volume (numeric)
Normalized whole-brain volume (numeric)
Atlas scaling factor (numeric)
The dataset name has been kept as OASIS_long_tbl_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix tbl_df indicates that the dataset is a tibble (a modern data frame). The original content has not been modified in any way. Variable names and values are provided exactly as they appear in the source.
Data taken from Kaggle: https://www.kaggle.com/datasets/jboysen/mri-and-alzheimers
This dataset, parkinsons_dopamine_list, is a list containing information from 7 studies investigating the mean lost work-time reduction in patients given 4 dopamine agonists and placebo as adjunct therapy for Parkinson's disease. The treatments are placebo and four active drugs, the latter coded 2 to 5.
data(parkinsons_dopamine_list)
A list with 5 components:
Numeric vector containing the outcomes (mean lost work-time reduction)
Numeric vector containing standard errors for the outcomes
Character vector indicating the treatment (placebo or drug codes 2-5)
Numeric vector indicating the study number (1-7)
Character vector showing the treatment order (placebo and drugs 2-5)
The dataset name has been kept as 'parkinsons_dopamine_list' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix 'list' indicates that the dataset is a list. The original content has not been modified in any way.
Data taken from the bnma package version 1.6.0.
This dataset, pediatric_glioma_tbl_df, is a tibble containing comprehensive clinical and tumor characteristics for 57 pediatric patients with high-grade glioma. The data includes 22 variables covering demographic, symptomatic, pathological, treatment, and outcome measures.
data(pediatric_glioma_tbl_df)
A tibble with 57 observations and 22 variables:
Numeric: Patient age in years
Character: Patient gender
Character: Headache presence/characteristics
Character: Epilepsy status
Character: Hemiparesis presence
Character: Increased intracranial pressure indicators
Character: Tumor pathology classification
Numeric: WHO tumor grade (III-IV)
Character: Thalamic involvement
Character: Bilateral extension
Character: Posterior fossa extension
Character: Brainstem involvement
Character: Multifocal tumor presence
Character: Midline shift presence
Character: Peritumoral edema characteristics
Numeric: Estimated tumor volume (cm³)
Character: Surgical resection extent
Character: Ventricular shunt presence
Character: Post-surgical residual tumor
Character: Neurological status
Numeric: Performance status pre-radiotherapy
Character: Mortality outcome
The dataset name has been kept as 'pediatric_glioma_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified.
Kaggle dataset: Pediatric High-Grade Glioma Dataset. URL: https://www.kaggle.com/datasets/amraam/pediatric-high-grade-glioma-dataset
This dataset, psych_neurocog_df, is a data frame containing comprehensive neurocognitive assessments from a study comparing performance patterns in schizophrenia, schizoaffective disorder, and controls. The data includes 242 observations across multiple cognitive domains using a psychosis-specific neurocognitive battery.
data(psych_neurocog_df)
A data frame with 242 observations and 10 variables:
Factor with 3 levels: Diagnostic group (Schizophrenia/Schizoaffective/Control)
Integer: Processing speed score
Integer: Attention/vigilance score
Integer: Working memory score
Integer: Verbal learning score
Integer: Visual learning score
Integer: Problem solving score
Integer: Social cognition score
Integer: Participant age in years
Factor with 2 levels: Participant sex
The dataset name has been updated to 'psych_neurocog_df' for brevity and clarity, while maintaining consistency with the naming style of the NeuroDataSets package. The suffix 'df' indicates that the dataset is a data frame.
Data taken from the heplots package version 1.7.4. Original research: Hartman, L.I. (2016) Schizophrenia and Schizoaffective Disorder: One Condition or Two? Unpublished PhD dissertation, York University.
This dataset, SAHemorrhage_df, is a data frame containing clinical and laboratory variables from 113 patients diagnosed with aneurysmal subarachnoid hemorrhage. The dataset includes functional outcomes, demographic information, clinical severity scores, and biomarker measurements. These data provide valuable information for studying neurological prognosis, biomarker associations, and clinical patterns in patients with subarachnoid hemorrhage.
data(SAHemorrhage_df)
A data frame with 113 observations and 7 variables:
Glasgow Outcome Scale at 6 months (ordered factor with 5 levels)
Clinical outcome classification (factor with 2 levels)
Gender of the patient (factor with 2 levels)
Age of the patient (integer)
WFNS clinical grade (ordered factor with 5 levels)
S100B biomarker level (numeric)
Nucleoside diphosphate kinase A level (numeric)
The dataset name has been kept as SAHemorrhage_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the reportROC package version 3.6
This dataset, sleep_deprivation_tbl_df, is a tibble containing data from a 2024 study conducted in the Middle East that investigated the effects of sleep deprivation on cognitive performance and emotional regulation. The dataset includes 60 participants from diverse backgrounds and captures detailed information on sleep duration, sleep quality, daytime sleepiness, cognitive performance metrics (reaction times and memory accuracy), and emotional stability. Additionally, the dataset records demographic and lifestyle factors such as age, gender, BMI, caffeine intake, physical activity level, and stress level.
data(sleep_deprivation_tbl_df)
A tibble with 60 observations and 14 variables:
Unique identifier for each participant (character)
Average hours of sleep per night (numeric)
Self-reported sleep quality score (numeric)
Level of daytime sleepiness (numeric)
Reaction time on the Stroop cognitive task (numeric)
Accuracy score on the N-Back working memory task (numeric)
Score reflecting emotional regulation ability (numeric)
Reaction time on the Psychomotor Vigilance Task (numeric)
Age of the participant in years (numeric)
Gender of the participant (character)
Body Mass Index (numeric)
Daily caffeine intake (numeric)
Self-reported physical activity level (numeric)
Self-reported stress level (numeric)
The dataset name has been kept as sleep_deprivation_tbl_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix tbl_df indicates that the dataset is a tibble (a modern data frame). The original content has not been modified in any way. Variable names and values are provided exactly as they appear in the source.
Data taken from Kaggle: https://www.kaggle.com/datasets/sacramentotechnology/sleep-deprivation-and-cognitive-performance
This dataset, sleep_disorder_df, is a data frame containing polysomnographic (PSG) measurements from a clinical study designed to compare automated and semi-automated scoring methods used in the diagnosis of transient sleep disorders. The study included 82 patients who were administered a sleep-inducing drug (Zolpidem 10 mg). The primary measure of interest is the latency to persistent sleep (LPS), defined as the time from lights out to the beginning of 10 consecutive minutes of uninterrupted sleep. LPS was measured using three different scoring methods: manual, automated, and partial.
data(sleep_disorder_df)
A data frame with 82 observations and 3 variables:
Latency to persistent sleep measured using manual scoring (numeric)
Latency to persistent sleep measured using automated scoring (numeric)
Latency to persistent sleep measured using semi-automated (partial) scoring (numeric)
The dataset name has been kept as sleep_disorder_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the MVT package version 0.3-81
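Since sleep_disorder_df records the same latency to persistent sleep under three scoring methods, a natural first look is how far the automated and partial scores deviate from manual scoring. The sketch below computes the mean difference and Bland-Altman style 95% limits of agreement. This is a technique chosen here for illustration, not one attributed to the original study. The column names (manual, automated, partial) and all values are assumptions.

```r
# Stand-in data: column names (manual, automated, partial) and values are
# assumptions for this sketch, not taken from sleep_disorder_df.
toy_lps <- data.frame(
  manual    = c(10, 20, 30, 40),
  automated = c(12, 19, 33, 44),
  partial   = c(10, 21, 30, 41)
)

# Mean difference (bias) and approximate 95% limits of agreement
# between one scoring method and a reference method.
agreement <- function(df, reference, method) {
  d <- df[[method]] - df[[reference]]
  c(bias  = mean(d),
    lower = mean(d) - 1.96 * sd(d),
    upper = mean(d) + 1.96 * sd(d))
}

auto_vs_manual <- agreement(toy_lps, "manual", "automated")
part_vs_manual <- agreement(toy_lps, "manual", "partial")

print(rbind(auto_vs_manual, part_vs_manual))
```

Limits of agreement assume roughly normal differences, so it is worth inspecting a histogram of the differences before relying on them.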
This dataset, sleep_performance_df, is a data frame containing measurements of the increase in slow-wave sleep and corresponding improvements in spatial learning tasks from 10 human subjects.
data(sleep_performance_df)
A data frame with 10 observations and 2 variables:
Integer vector representing increase in slow-wave sleep (units)
Integer vector representing improvement in spatial learning tasks (units)
The dataset name has been kept as 'sleep_performance_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the abd package version 0.2-8
This dataset, subcortical_patterns_tbl_df, is a tibble containing expected patterns of subcortical structures in schizophrenia derived from large-scale meta-analyses by the ENIGMA consortium. It includes data from multiple neurological and psychiatric conditions for comparison.
data(subcortical_patterns_tbl_df)
A tibble with 8 observations and 16 variables:
Character vector indicating subcortical regions
Numeric vector of expected patterns for schizophrenia spectrum disorder
Numeric vector of expected patterns for major depressive disorder
Numeric vector of expected patterns for Alzheimer's disease (ADNI cohort)
Numeric vector of expected patterns for Alzheimer's disease (ADNI+OSYRIX cohort)
Numeric vector of expected patterns for bipolar disorder
Numeric vector of expected patterns for Parkinson's disease
Numeric vector of expected patterns for diabetes
Numeric vector of expected patterns for high blood pressure
Numeric vector of expected patterns for high lipids
Numeric vector of expected patterns for metabolic syndrome
Numeric vector of expected patterns for 22q11.2 deletion syndrome
Numeric vector of expected patterns for suicide
Numeric vector of expected patterns for pediatric OCD
Numeric vector of expected patterns for adult OCD
Numeric vector of expected patterns for anorexia nervosa
The dataset name has been kept as 'subcortical_patterns_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Data taken from the RVIpkg package version 0.3.2
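Since each numeric column holds one condition's expected pattern across the same regions, the conditions can be compared by correlating their columns. The sketch below ranks the other conditions by similarity to a reference column. It is illustrative only: `pattern_similarity()` is a hypothetical helper, it assumes the first column holds region labels and the second column is the schizophrenia pattern, and the toy column names and values are invented.

```r
# Hypothetical helper: correlate every pattern column with a reference
# column and return the correlations sorted from most to least similar.
pattern_similarity <- function(df, reference = 2) {
  numeric_part <- as.data.frame(df[-1])  # drop the region label column
  cors <- cor(numeric_part, use = "pairwise.complete.obs")
  ref_name <- names(df)[reference]
  others <- setdiff(colnames(cors), ref_name)
  sort(cors[ref_name, others], decreasing = TRUE)
}

# Toy data (made up): "mdd" tracks "ssd" exactly, "bd" is its mirror image.
toy <- data.frame(
  region = c("r1", "r2", "r3", "r4"),
  ssd = c(1, 2, 3, 4),
  mdd = c(2, 4, 6, 8),
  bd = c(4, 3, 2, 1)
)
toy_similarity <- pattern_similarity(toy)

# Apply to the packaged data only if the package is installed.
if (requireNamespace("NeuroDataSets", quietly = TRUE)) {
  data("subcortical_patterns_tbl_df", package = "NeuroDataSets")
  print(pattern_similarity(subcortical_patterns_tbl_df))
}
```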
This dataset, TBI_age_tbl_df, is a tibble containing information from the year 2014 on traumatic brain injury (TBI) cases across different age groups. The dataset provides details on the mechanisms that caused the injuries, the type of injury, the estimated number of observed cases, and the estimated rate of cases per 100,000 people.
data(TBI_age_tbl_df)
A tibble with 231 observations and 5 variables:
Age group category (character)
Type of traumatic brain injury (character)
Mechanism by which the injury occurred (character)
Estimated number of observed cases in 2014 (numeric)
Estimated rate of cases per 100,000 population in 2014 (numeric)
The dataset name has been kept as TBI_age_tbl_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix tbl_df indicates that the dataset is a tibble (a modern data frame). The original content has not been modified in any way. Variable names and values are provided exactly as they appear in the source.
Data taken from Kaggle: https://www.kaggle.com/datasets/jessemostipak/traumatic-brain-injury-tbi
This dataset, TBI_military_tbl_df, is a tibble containing information on traumatic brain injuries (TBI) diagnosed among U.S. military personnel. The dataset includes the service branch, military component, severity of the injury, number of diagnosed cases, and the year of observation.
data(TBI_military_tbl_df)
A tibble with 438 observations and 5 variables:
Branch of military service (character)
Status of the individual (active duty, reserve, or guard) (character)
Severity category of the traumatic brain injury (character)
Number of diagnosed TBI cases (numeric)
Year of recorded TBI diagnosis (numeric)
The dataset name has been kept as TBI_military_tbl_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix tbl_df indicates that the dataset is a tibble (a modern data frame). The original content has not been modified in any way. Variable names and values are provided exactly as they appear in the source.
Data taken from Kaggle: https://www.kaggle.com/datasets/jessemostipak/traumatic-brain-injury-tbi
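A common first look at count data of this shape is a cross-tabulation of diagnosed cases by year and severity. The sketch below shows one way to build it with base R. `cases_by_year_severity()` is a hypothetical helper, columns are addressed by position in the order listed above (3 = severity, 4 = diagnosed cases, 5 = year), and the toy rows are invented.

```r
# Hypothetical helper: total diagnosed cases per year and severity.
# Cells with no rows for a year/severity combination are NA.
cases_by_year_severity <- function(df) {
  severity <- df[[3]]   # assumed: severity category
  diagnosed <- df[[4]]  # assumed: number of diagnosed cases
  year <- df[[5]]       # assumed: year of diagnosis
  tapply(diagnosed, list(year = year, severity = severity), sum, na.rm = TRUE)
}

# Toy data (made up, not taken from the dataset).
toy <- data.frame(
  service = c("A", "A", "B", "B"),
  component = c("Active", "Reserve", "Active", "Reserve"),
  severity = c("Mild", "Mild", "Severe", "Mild"),
  diagnosed = c(10, 5, 2, 3),
  year = c(2006, 2006, 2006, 2007)
)
toy_totals <- cases_by_year_severity(toy)

# Apply to the packaged data only if the package is installed.
if (requireNamespace("NeuroDataSets", quietly = TRUE)) {
  data("TBI_military_tbl_df", package = "NeuroDataSets")
  print(cases_by_year_severity(TBI_military_tbl_df))
}
```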
This dataset, TBI_steroids_df, is a data frame containing data from a systematic review evaluating the effects of corticosteroids on mortality in patients with acute traumatic brain injury. The dataset includes results from randomized controlled trials, including the influential MRC CRASH trial (Roberts et al. 2001). Variables include study identifiers, numbers of deaths in the corticosteroid and control groups, and corresponding sample sizes. These data are useful for meta-analytic investigations of corticosteroid efficacy in traumatic brain injury.
data(TBI_steroids_df)
A data frame with 17 observations and 5 variables:
Study identifier (character)
Number of deaths in the corticosteroid group (numeric)
Sample size of the corticosteroid group (numeric)
Number of deaths in the control group (numeric)
Sample size of the control group (numeric)
The dataset name has been kept as TBI_steroids_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the ratesci package version 1.0.0
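As an illustration of the meta-analytic use mentioned above, the sketch below computes per-study risk ratios and a Mantel-Haenszel pooled risk ratio from deaths and sample sizes. This is one standard fixed-effect estimator chosen for the example, not necessarily the method used in the original review. `pooled_risk_ratio()` is a hypothetical helper, the columns of the real dataset are addressed by position in the order listed above, and the toy counts are invented.

```r
# Hypothetical helper: per-study and Mantel-Haenszel pooled risk ratios.
# RR_MH = sum(a_i * n0_i / N_i) / sum(c_i * n1_i / N_i), where a_i and c_i
# are deaths in the treated and control arms, n1_i and n0_i are the arm
# sizes, and N_i = n1_i + n0_i.
pooled_risk_ratio <- function(deaths_trt, n_trt, deaths_ctl, n_ctl) {
  total <- n_trt + n_ctl
  per_study <- (deaths_trt / n_trt) / (deaths_ctl / n_ctl)
  pooled <- sum(deaths_trt * n_ctl / total) / sum(deaths_ctl * n_trt / total)
  list(per_study = per_study, pooled = pooled)
}

# Toy data (made up): two equally sized studies.
toy_rr <- pooled_risk_ratio(
  deaths_trt = c(10, 20), n_trt = c(100, 100),
  deaths_ctl = c(20, 20), n_ctl = c(100, 100)
)

# Apply to the packaged data only if the package is installed.
# Assumed column order: 2 = deaths (corticosteroid), 3 = n (corticosteroid),
# 4 = deaths (control), 5 = n (control).
if (requireNamespace("NeuroDataSets", quietly = TRUE)) {
  data("TBI_steroids_df", package = "NeuroDataSets")
  d <- TBI_steroids_df
  print(pooled_risk_ratio(d[[2]], d[[3]], d[[4]], d[[5]])$pooled)
}
```

Studies with zero deaths in the control arm yield an infinite or undefined per-study ratio, so those entries should be inspected before interpretation. The pooled estimate is only affected if every study has zero control deaths.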
This dataset, tourette_ADHD_df, is a data frame containing accuracy scores from 51 adult participants grouped into three categories related to Tourette’s Syndrome and attentional dysfunction. The data include performance accuracy and group membership, allowing comparison across diagnostic groups. Some observations may contain missing values. The dataset originates from research on attentional processes in adults with Tourette’s Syndrome.
data(tourette_ADHD_df)
A data frame with 51 observations and 2 variables:
Accuracy score (numeric)
Participant group (factor with 3 levels)
The dataset name has been kept as tourette_ADHD_df to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the NeuroDataSets package and assists users in identifying its specific characteristics. The suffix df indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the rcollectadhd package version 0.8
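Because some accuracy scores may be missing, group comparisons should count the non-missing observations explicitly instead of relying on the raw group sizes. The sketch below is a minimal example of that. `group_summary()` is a hypothetical helper, it assumes the first column is the accuracy score and the second is the group, and the toy values are invented.

```r
# Hypothetical helper: per-group mean accuracy, ignoring missing scores,
# together with the number of non-missing observations in each group.
group_summary <- function(df) {
  score <- df[[1]]          # assumed: accuracy score
  group <- factor(df[[2]])  # assumed: participant group
  data.frame(
    group = levels(group),
    n_non_missing = as.vector(tapply(!is.na(score), group, sum)),
    mean_score = as.vector(tapply(score, group, mean, na.rm = TRUE))
  )
}

# Toy data (made up): one missing score in group "A".
toy <- data.frame(
  accuracy = c(0.9, NA, 0.7, 0.5, 0.6, 0.8),
  group = factor(c("A", "A", "B", "B", "C", "C"))
)
toy_summary <- group_summary(toy)

# Apply to the packaged data only if the package is installed.
if (requireNamespace("NeuroDataSets", quietly = TRUE)) {
  data("tourette_ADHD_df", package = "NeuroDataSets")
  print(group_summary(tourette_ADHD_df))
}
```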
This function lists all datasets available in the 'NeuroDataSets' package. If the 'NeuroDataSets' package is not loaded, it stops and shows an error message. If no datasets are available, it returns a message and an empty vector.
view_datasets_NeuroDataSets()
A character vector with the names of the available datasets. If no datasets are found, it returns an empty character vector.
if (requireNamespace("NeuroDataSets", quietly = TRUE)) {
  library(NeuroDataSets)
  view_datasets_NeuroDataSets()
}
This dataset, WMpatterns_tbl_df, is a tibble containing expected patterns of white matter derived from large-scale meta-analyses by the ENIGMA consortium. Alongside schizophrenia, it includes patterns for other neurological, psychiatric, and cardiometabolic conditions for comparison.
data(WMpatterns_tbl_df)
A tibble with 24 observations and 15 variables:
Character vector indicating white matter regions
Numeric vector of expected patterns for schizophrenia spectrum disorder
Numeric vector of expected patterns for major depressive disorder
Numeric vector of expected patterns for Alzheimer's disease (ADNI cohort)
Numeric vector of expected patterns for Alzheimer's disease (ADNI+OSYRIX cohort)
Numeric vector of expected patterns for bipolar disorder
Numeric vector of expected patterns for diabetes
Numeric vector of expected patterns for high blood pressure
Numeric vector of expected patterns for high lipids
Numeric vector of expected patterns for metabolic syndrome
Numeric vector of expected patterns for 22q11.2 deletion syndrome
Numeric vector of expected patterns for post-traumatic stress disorder
Numeric vector of expected patterns for traumatic brain injury
Numeric vector of expected patterns for pediatric OCD
Numeric vector of expected patterns for adult OCD
The dataset name has been changed from 'white_matter_patterns_tbl_df' to 'WMpatterns_tbl_df' to follow the shorter naming convention adopted for the NeuroDataSets package while maintaining clarity. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Data taken from the RVIpkg package version 0.3.2