Title: | Comprehensive Medical, Disease, Treatment, and Drug Datasets |
---|---|
Description: | Provides an extensive collection of datasets related to medicine, diseases, treatments, drugs, and public health. This package covers topics such as drug effectiveness, vaccine trials, survival rates, infectious disease outbreaks, and medical treatments. The included datasets span various health conditions, including AIDS, cancer, bacterial infections, and COVID-19, along with information on pharmaceuticals and vaccines. These datasets are sourced from the R ecosystem and other R packages, remaining unaltered to ensure data integrity. This package serves as a valuable resource for researchers, analysts, and healthcare professionals interested in conducting medical and public health data analysis in R. |
Authors: | Renzo Caceres Rossi [aut, cre] |
Maintainer: | Renzo Caceres Rossi <[email protected]> |
License: | GPL-3 |
Version: | 0.1.0 |
Built: | 2024-10-25 05:32:26 UTC |
Source: | https://github.com/lightbluetitan/meddatasets |
The dataset name has been changed to 'Aids2_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(Aids2_df)
A data frame with 2843 observations and 7 variables:
A factor indicating the state of residence of the patient (4 levels).
A factor indicating the sex of the patient (2 levels).
An integer giving the (Julian) date of diagnosis.
An integer giving the (Julian) date of death or end of observation.
A factor indicating the status of the patient at the end of observation (2 levels: alive or deceased).
A factor indicating the reported transmission category (8 levels).
An integer indicating the age of the patient at diagnosis.
This dataset provides information on the survival rates and characteristics of AIDS patients in Australia, including their demographic details and clinical data.
Australian Department of Health.
The dataset name has been changed to 'anorexia_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(anorexia_df)
A data frame with 72 observations and 3 variables:
A factor indicating the treatment group (with 3 possible levels).
A numeric value representing the weight of the patient before treatment (in pounds).
A numeric value representing the weight of the patient after treatment (in pounds).
This dataset contains information on weight changes among patients diagnosed with anorexia, including their treatment and weight measurements before and after treatment.
Data collected from clinical studies on anorexia treatment and weight change.
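The before and after weights lend themselves to a per-group comparison. The following is a minimal sketch, not part of the package documentation: it assumes the column names `Treat`, `Prewt`, and `Postwt` (taken from `MASS::anorexia`, which this dataset appears to mirror) and falls back to the MASS copy when MedDataSets is not installed.

```r
# Load the packaged copy if MedDataSets is installed; otherwise fall back to
# MASS::anorexia (assumed here to be the unaltered original).
if (requireNamespace("MedDataSets", quietly = TRUE)) {
  data("anorexia_df", package = "MedDataSets")
} else {
  anorexia_df <- MASS::anorexia
}

# Column names Treat, Prewt and Postwt are assumptions based on MASS::anorexia.
anorexia_df$change <- anorexia_df$Postwt - anorexia_df$Prewt

# Mean weight change (in pounds) per treatment group.
mean_change <- tapply(anorexia_df$change, anorexia_df$Treat, mean)
print(round(mean_change, 2))
```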
The dataset name has been changed to 'antibiotics_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(antibiotics_tbl_df)
A tibble with 92 observations and 1 variable:
A factor indicating the pre-existing condition of the children (with 8 possible levels).
This dataset contains information about pre-existing conditions in 92 children, providing insights into the prevalence of different conditions in the sample.
Data collected from a clinical study assessing children's health conditions.
The dataset name has been changed to 'avandia_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(avandia_tbl_df)
A tibble with 227,571 observations and 2 variables:
A factor indicating the type of diabetes medicine (with 2 possible levels).
A factor indicating the presence of cardiovascular problems (with 2 possible levels).
This dataset contains information on cardiovascular problems associated with two types of diabetes medicines, providing insights into the safety and efficacy of these treatments.
Data collected from clinical trials assessing the cardiovascular effects of diabetes medications.
The dataset name has been changed to 'babies_crawl_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(babies_crawl_tbl_df)
A tibble with 12 observations and 5 variables:
A factor indicating the month of birth (January to December).
A numeric value representing the average crawling age (in months).
A numeric value indicating the standard deviation of crawling age.
An integer representing the number of infants included in the calculation.
An integer indicating the average temperature (in degrees Celsius) during the month.
This dataset contains information on the average crawling age of infants based on the month of birth, as well as associated factors such as temperature.
Data collected on the crawling age of infants based on birth month and temperature.
The dataset name has been changed to 'babies_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(babies_tbl_df)
A tibble with 1,236 observations and 8 variables:
An integer indicating the case number.
An integer representing the birth weight of the infant (in ounces).
An integer indicating the length of gestation (in days).
An integer indicating whether this is a first pregnancy (0 = first pregnancy).
An integer indicating the age of the mother (in years).
An integer indicating the height of the mother (in inches).
An integer indicating the weight of the mother (in pounds).
An integer indicating whether the mother smoked during pregnancy (1 = yes, 0 = no).
This dataset contains information from the Child Health and Development Studies, focusing on various factors associated with infant health outcomes.
Data collected from the Child Health and Development Studies.
The dataset name has been changed to 'bacteria_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(bacteria_df)
A data frame with 220 observations and 6 variables:
A factor indicating the presence (y) or absence (n) of bacteria.
A factor indicating whether the patient received active treatment or placebo (a/p).
A factor indicating high or low compliance (hi/lo).
An integer representing the week of treatment.
A factor representing the unique identifier for each patient (with 50 possible levels).
A factor indicating the treatment group (with 3 possible levels).
This dataset contains information on the presence of bacteria in patients after receiving various drug treatments, including the treatment group and duration of treatment.
Data collected from clinical trials on antibiotic treatments and their effects on bacterial presence.
The dataset name has been changed to 'bdims_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(bdims_tbl_df)
A tibble with 507 observations and 25 variables:
Twenty-one numeric variables, each giving a body measurement (in cm).
An integer indicating the age of the individual (in years).
A numeric value indicating the weight of the individual (in kg).
A numeric value indicating the height of the individual (in cm).
An integer indicating the sex of the individual (1 = male, 2 = female).
This dataset contains body measurements of 507 physically active individuals, including various dimensions and physical attributes.
Data collected from physically active individuals to analyze body measurements.
The dataset name has been changed to 'biontech_teens_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(biontech_teens_tbl_df)
A tibble with 2,260 observations and 2 variables:
A factor indicating the group (e.g., vaccinated vs. unvaccinated).
A factor indicating the outcome (e.g., infection status).
This dataset contains information on the efficacy of the Pfizer-BioNTech COVID-19 vaccine among adolescents, detailing the outcomes based on different groups.
Data collected during clinical trials to evaluate vaccine efficacy in adolescents.
The dataset name has been changed to 'birthwt_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(birthwt_df)
A data frame with 189 observations and 10 variables:
An integer indicating whether the infant's birth weight is low (1) or not (0).
An integer representing the age of the mother (in years).
An integer indicating the mother's weight at last menstrual period (in pounds).
An integer indicating the race of the mother (coded as 1, 2, or 3).
An integer indicating whether the mother smoked during pregnancy (1 for yes, 0 for no).
An integer indicating the number of premature labors.
An integer indicating whether the mother had a history of hypertension (1 for yes, 0 for no).
An integer indicating whether the mother had a history of uterine irritability (1 for yes, 0 for no).
An integer indicating the number of physician visits during the first trimester.
An integer representing the infant's birth weight (in grams).
This dataset contains information on risk factors associated with low infant birth weight, including maternal characteristics and behaviors during pregnancy.
Data collected from maternal health studies related to infant birth weight.
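As an illustration of how the binary indicators can be used, the sketch below cross-tabulates smoking status against the low-birth-weight flag. It is a hedged example only: the column names `smoke` and `low` are assumptions taken from `MASS::birthwt`, which serves as the fallback when MedDataSets is not installed.

```r
# Load the packaged copy if available; otherwise use MASS::birthwt
# (assumed to be the unaltered original).
if (requireNamespace("MedDataSets", quietly = TRUE)) {
  data("birthwt_df", package = "MedDataSets")
} else {
  birthwt_df <- MASS::birthwt
}

# Cross-tabulate smoking during pregnancy against the low-birth-weight indicator.
# Column names smoke and low are assumptions based on MASS::birthwt.
tab <- table(smoke = birthwt_df$smoke, low = birthwt_df$low)

# Row-wise proportions: share of low-birth-weight infants within each smoking group.
prop_low <- prop.table(tab, margin = 1)
print(round(prop_low, 3))
```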
The dataset name has been changed to 'colmozzie_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(colmozzie_tbl_df)
A tibble with 279 observations and 12 variables:
Number of dengue cases reported during the week (integer).
Year of the reported cases (integer).
Week of the year (integer).
Average temperature (numeric).
Maximum temperature recorded (numeric).
Minimum temperature recorded (numeric).
Sea level pressure (character).
Humidity levels (numeric).
Precipitation levels (numeric).
Wind velocity (numeric).
Another wind variable (numeric).
Yet another wind variable (numeric).
This dataset contains weekly reported cases of dengue fever in Sri Lanka, along with various meteorological variables that may be associated with the incidence of the disease.
The dataset is based on public health records and meteorological data from Sri Lanka.
The dataset name has been changed to 'covid19sf_hospital_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(covid19sf_hospital_df)
A data frame with 4,514 observations and 5 variables:
The name of the hospital (character).
The date of the reported data (Date).
The type of bed (character), such as ICU, general, etc.
The status of the beds (character), indicating if they are occupied, available, etc.
The number of beds reported (integer).
This dataset provides information on hospital capacity in San Francisco during the COVID-19 pandemic. It details the number of available hospital beds categorized by type and status, along with the respective hospitals and dates. The dataset is crucial for understanding the hospital system's response and capacity to handle COVID-19 cases.
San Francisco Department of Public Health COVID-19 hospital capacity data.
The dataset name has been changed to 'covid19sf_tests_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(covid19sf_tests_df)
A data frame with 652 observations and 6 variables:
The date when the specimen was collected (Date).
The total number of tests conducted (integer).
The number of positive test results (integer).
The percentage of positive tests (numeric).
The number of negative test results (integer).
The number of indeterminate test results (integer).
This dataset contains information on COVID-19 tests conducted in San Francisco, detailing the number of tests performed, the number of positive and negative results, as well as other related metrics. It provides insights into the testing patterns and results during the COVID-19 pandemic.
San Francisco Department of Public Health COVID-19 testing data.
The dataset name has been changed to 'Cushings_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(Cushings_df)
A data frame with 27 observations and 3 variables:
A numeric vector representing the levels of Tetrahydrocortisone.
A numeric vector representing the levels of Pregnanetriol.
A factor indicating the underlying type of syndrome (4 levels: adenoma, bilateral hyperplasia, carcinoma, or unknown).
This dataset contains results from diagnostic tests conducted on patients suspected of having Cushing's Syndrome, focusing on the measurement of specific hormonal metabolites.
Data collected from clinical trials and patient records.
The dataset name has been changed to 'diabetes2_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(diabetes2_tbl_df)
A tibble with 699 observations and 2 variables:
A factor indicating the type of treatment administered (e.g., different medication types).
A factor indicating the outcome of the treatment (e.g., improvement or no improvement).
This dataset contains information from a clinical trial focusing on Type 2 diabetes in patients aged 10 to 17 years, detailing the treatment approaches and their outcomes.
Data collected from clinical trials assessing treatment efficacy for Type 2 diabetes in adolescents.
The dataset name has been changed to 'drugbank_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(drugbank_df)
A data frame with 27,728 observations and 2 variables:
Gene associated with the disease (factor).
Disease associated with the gene (factor).
This dataset contains information about the relationships between genes and diseases, providing insights into how specific genes may be associated with various diseases.
The dataset is derived from drug interaction databases and gene-disease relationships.
The dataset name has been changed to 'ebola_survey_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(ebola_survey_tbl_df)
A tibble with 1,042 observations and 1 variable:
A factor indicating the responses related to quarantine measures (e.g., yes or no).
This dataset contains survey responses regarding quarantine measures during the Ebola outbreak, focusing on public perceptions and behaviors.
Data collected from surveys conducted during the Ebola outbreak to assess public sentiment towards quarantine.
The dataset name has been changed to 'edgar_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(edgar_df)
A data frame with 1,038,340 observations and 2 variables:
Gene associated with the disease (factor).
Disease associated with the gene (factor).
This dataset contains information about the relationships between genes and diseases, specifically focusing on data sourced from the Edgar database, providing insights into how specific genes may be associated with various diseases.
The dataset is derived from the Edgar database and focuses on gene-disease relationships.
The dataset name has been changed to 'esoph_df' to avoid confusion with datasets from other packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(esoph_df)
A data frame with 88 observations and 5 variables:
Age group of the individuals (ordered factor).
Alcohol consumption group (ordered factor).
Tobacco consumption group (ordered factor).
Number of cases (numeric).
Number of controls (numeric).
This dataset contains data from a case-control study investigating the association between smoking, alcohol consumption, and esophageal cancer. It includes the number of cancer cases and controls for various age, alcohol consumption, and smoking groups.
Data from a case-control study on esophageal cancer.
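Because the data are stored as grouped counts, a quick summary requires aggregating cases and controls over the grouping factors. The sketch below does this by alcohol group. It is illustrative only and assumes the column names `alcgp`, `ncases`, and `ncontrols` from base R's `datasets::esoph`, which is used as a fallback when MedDataSets is not installed.

```r
# Load the packaged copy if available; otherwise use datasets::esoph
# (assumed to be the unaltered original).
if (requireNamespace("MedDataSets", quietly = TRUE)) {
  data("esoph_df", package = "MedDataSets")
} else {
  esoph_df <- datasets::esoph
}

# Sum cases and controls within each alcohol consumption group.
# Column names are assumptions based on datasets::esoph.
by_alc <- aggregate(cbind(ncases, ncontrols) ~ alcgp, data = esoph_df, FUN = sum)

# Share of cases among all subjects recorded in each group.
by_alc$case_share <- by_alc$ncases / (by_alc$ncases + by_alc$ncontrols)
print(by_alc)
```

The case share is a descriptive quantity only; in a case-control design it does not estimate disease risk.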
The dataset name has been changed to 'fdeaths_ts' to avoid confusion with datasets from other packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_ts' indicates that this dataset is a time series object, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(fdeaths_ts)
A time series object with 72 monthly observations, from January 1974 to December 1979:
A numeric vector representing the number of monthly deaths due to lung diseases in females.
This dataset contains the number of monthly deaths from lung diseases in the UK, specifically for females, from 1974 to 1979.
UK Health Authority data on lung disease deaths.
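A time series object carries its own start, end, and frequency, which can be inspected directly. The following sketch prints them and computes yearly totals. It is a hedged example that falls back to base R's `datasets::fdeaths` (assumed to be the unaltered original) when MedDataSets is not installed.

```r
# Load the packaged copy if available; otherwise use datasets::fdeaths.
if (requireNamespace("MedDataSets", quietly = TRUE)) {
  data("fdeaths_ts", package = "MedDataSets")
} else {
  fdeaths_ts <- datasets::fdeaths
}

# Inspect the time index: first period, last period, observations per year.
print(start(fdeaths_ts))
print(end(fdeaths_ts))
print(frequency(fdeaths_ts))

# Collapse the monthly series to yearly totals.
yearly <- aggregate(fdeaths_ts, nfrequency = 1, FUN = sum)
print(yearly)
```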
The dataset name has been changed to 'GAGurine_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(GAGurine_df)
A data frame with 314 observations and 2 variables:
A numeric vector representing the age of the children in years.
A numeric vector representing the levels of glycosaminoglycans (GAG) in urine (in some appropriate unit).
This dataset contains measurements of glycosaminoglycan (GAG) levels in urine samples collected from children, aimed at studying potential health implications associated with abnormal GAG levels.
Data collected from pediatric health studies.
The dataset name has been changed to 'gehan_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(gehan_df)
A data frame with 42 observations and 4 variables:
An integer representing the patient pair identifier.
An integer indicating the remission time (in weeks).
An integer censoring indicator (0 for censored, 1 for not censored).
A factor indicating the treatment group (with 2 possible levels).
This dataset contains information on the remission times of leukaemia patients, including treatment information and whether the remission time was censored.
Data collected from clinical studies on leukaemia treatment and remission.
The dataset name has been changed to 'genotype_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(genotype_df)
A data frame with 61 observations and 3 variables:
A factor indicating the genotype of the litter (with 4 possible levels).
A factor indicating the genotype of the foster mother (with 4 possible levels).
A numeric value representing the average weight gain of the litter (in grams).
This dataset contains genotype data from rats, including information on litter, maternal lineage, and weight measurements.
Data collected from genetic studies involving rat populations.
The dataset name has been changed to 'heart_transplant_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(heart_transplant_tbl_df)
A tibble with 103 observations and 8 variables:
An integer identifier for each patient.
The year the patient was accepted for transplantation.
The age of the patient at the time of transplantation.
A factor indicating the survival status of the patient (alive or dead).
The number of days the patient was alive after being determined a transplant candidate, up to the termination date of the study.
A factor indicating whether the patient had prior surgery (yes or no).
A factor indicating the transplant status (control or treatment).
The wait time (in days) for the transplant.
This dataset contains information on heart transplant patients, including demographics, survival outcomes, and wait times.
Data collected from heart transplant records to study patient outcomes and factors influencing survival.
The dataset name has been changed to 'infert_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(infert_df)
A data frame with 248 observations and 8 variables:
A factor representing the education level of the subjects, with 3 levels.
A numeric vector indicating the age of the subjects.
A numeric vector representing the number of previous pregnancies.
A numeric vector indicating the number of induced abortions.
A numeric vector indicating the case status (infertile or not).
A numeric vector indicating the number of spontaneous abortions.
An integer representing the stratum in the study.
A numeric vector representing the pooled stratum values.
This dataset examines the relationship between various factors such as education level, age, parity, and the incidence of infertility after spontaneous and induced abortion.
Data collected from clinical studies on infertility.
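One way to explore the relationship described above is a logistic regression of case status on the abortion counts. The sketch below is illustrative only: the column names `case`, `spontaneous`, and `induced` are assumptions taken from base R's `datasets::infert`, which is the fallback when MedDataSets is not installed.

```r
# Load the packaged copy if available; otherwise use datasets::infert
# (assumed to be the unaltered original).
if (requireNamespace("MedDataSets", quietly = TRUE)) {
  data("infert_df", package = "MedDataSets")
} else {
  infert_df <- datasets::infert
}

# Unconditional logistic regression of case status on the numbers of
# spontaneous and induced abortions. Column names are assumptions.
fit <- glm(case ~ spontaneous + induced, data = infert_df, family = binomial())

# Coefficient table: estimates are on the log-odds scale.
print(summary(fit)$coefficients)
```

Since the study is a matched case-control design, a conditional logistic regression that respects the strata would usually be preferred. The unconditional fit here is only a quick first look.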
The dataset name has been changed to 'ldeaths_ts' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_ts' indicates that this dataset is a time series, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(ldeaths_ts)
A time series object with 72 observations:
A numeric vector containing the number of monthly deaths from lung diseases in the UK.
This dataset provides information on the monthly deaths from lung diseases in the UK for both sexes combined, recorded from 1974 to 1979.
Office for National Statistics, UK.
The dataset name has been changed to 'leuk_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(leuk_df)
A data frame with 33 observations and 3 variables:
An integer representing the white blood cell count.
A factor indicating the result of a test (2 levels: present or absent).
An integer indicating the survival time (in weeks).
This dataset contains survival times and white blood cell counts for leukaemia patients, providing insights into the relationship between blood counts and survival outcomes.
Data collected from clinical studies on leukaemia patients.
The dataset name has been changed to 'mala_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(mala_df)
A data frame with 241,306 observations and 2 variables:
Disease associated with the gene (factor).
Gene associated with the disease (factor).
This dataset contains information about the relationships between genes and diseases, providing insights into how specific genes are associated with various diseases. It offers a comprehensive view of gene-disease associations.
The dataset contains gene-disease relationship data from various scientific studies and databases.
The dataset name has been changed to 'malaria_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(malaria_tbl_df)
A tibble with 20 observations and 2 variables:
A factor indicating the type of treatment administered (e.g., vaccine or placebo).
A factor indicating the outcome of the treatment (e.g., success or failure).
This dataset contains information from a malaria vaccine trial, focusing on the treatment administered and the outcomes observed in the participants.
Data collected from a clinical trial assessing the efficacy of a malaria vaccine.
The dataset name has been changed to 'mdeaths_ts' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_ts' indicates that this dataset is a time series, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(mdeaths_ts)
A time series object with 72 observations:
A numeric vector containing the monthly number of male deaths from lung diseases (bronchitis, emphysema, and asthma) in the UK.
This dataset provides the monthly deaths from lung diseases among males in the UK, recorded from January 1974 to December 1979.
Office for National Statistics, UK.
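A minimal sketch of inspecting the time attributes of this series. It assumes that `mdeaths_ts` is an unaltered copy of base R's `datasets::mdeaths`, as the package description suggests; the fallback assignment is illustrative, so the snippet also runs when MedDataSets is not installed.

```r
# Load the series from MedDataSets when available; otherwise fall back to the
# base R copy (assumption: the two are identical, per the package description).
if (requireNamespace("MedDataSets", quietly = TRUE)) {
  data("mdeaths_ts", package = "MedDataSets")
} else {
  mdeaths_ts <- datasets::mdeaths
}

length(mdeaths_ts)     # number of monthly observations
start(mdeaths_ts)      # first time point as c(year, month)
end(mdeaths_ts)        # last time point as c(year, month)
frequency(mdeaths_ts)  # observations per year

# Total deaths per calendar year
yearly_total <- aggregate(mdeaths_ts, nfrequency = 1, FUN = sum)
yearly_total
```

Because the object is a `ts`, functions such as `window()` and `aggregate()` use its time index directly, so no separate date column is needed.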
The MedDataSets package provides an extensive collection of datasets related to medicine, diseases, treatments, drugs, and public health. It covers topics such as drug effectiveness, vaccine trials, survival rates, infectious disease outbreaks, and medical treatments.
MedDataSets: Comprehensive Medical, Disease, Treatment, and Drug Datasets
Maintainer: Renzo Caceres Rossi [email protected]
The dataset name has been changed to 'Melanoma_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(Melanoma_df)
A data frame with 205 observations and 7 variables:
An integer representing the survival time of the patients (in days).
An integer indicating the status of the patient at the end of the study, coded as 1 for died from melanoma, 2 for alive, and 3 for dead from other causes.
An integer representing the sex of the patient; usually coded as 1 for male and 0 for female.
An integer indicating the age of the patient at diagnosis (in years).
An integer representing the year of diagnosis.
A numeric value indicating the thickness of the melanoma (in millimeters).
An integer indicating the presence of ulceration; usually coded as 1 for yes and 0 for no.
This dataset contains information on survival rates of patients diagnosed with malignant melanoma, including various clinical factors that may affect prognosis.
Data collected from clinical studies on malignant melanoma.
The dataset name has been changed to 'Mixtures_Drug_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(Mixtures_Drug_tbl_df)
A tibble with 819 observations and 3 variables:
Name of the drug mixture (character).
Ingredients that make up the drug mixture (character).
Identifier linking the mixture to its parent compound or category (character).
This dataset provides information about various drug mixtures, including their names and ingredients.
The dataset is derived from publicly available pharmaceutical databases and research studies.
The dataset name has been changed to 'muscle_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(muscle_df)
A data frame with 60 observations and 3 variables:
A factor indicating the specific muscle strip (with 21 possible levels).
A numeric value representing the concentration of the calcium chloride solution (in multiples of 2.2 mM).
A numeric value indicating the length of the muscle strip (in millimeters).
This dataset contains experimental data on the effect of calcium chloride on muscle contraction in rat hearts, including measurements of muscle strip length and calcium concentration.
Data collected from experiments assessing the impact of calcium chloride on muscle contraction.
The dataset name has been changed to 'Pima_te_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(Pima_te_df)
A data frame with 332 observations and 8 variables:
An integer representing the number of pregnancies.
An integer indicating the plasma glucose concentration (mg/dL) 2 hours after an oral glucose tolerance test.
An integer representing the diastolic blood pressure (mm Hg).
An integer indicating the triceps skin fold thickness (mm).
A numeric value indicating the body mass index (BMI).
A numeric value representing the diabetes pedigree function.
An integer indicating the age of the individual (in years).
A factor indicating whether the individual has diabetes (levels: Yes or No).
This dataset contains medical examination data for Pima Indian women, including various health metrics that may be related to diabetes.
Data collected from medical examinations of Pima Indian women.
The dataset name has been changed to 'Pima_tr_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(Pima_tr_df)
A data frame with 200 observations and 8 variables:
An integer representing the number of pregnancies.
An integer indicating the plasma glucose concentration (mg/dL) 2 hours after an oral glucose tolerance test.
An integer representing the diastolic blood pressure (mm Hg).
An integer indicating the triceps skin fold thickness (mm).
A numeric value indicating the body mass index (BMI).
A numeric value representing the diabetes pedigree function.
An integer indicating the age of the individual (in years).
A factor indicating whether the individual has diabetes (levels: Yes or No).
This dataset contains medical examination data for Pima Indian women, including various health metrics that may be related to diabetes.
Data collected from medical examinations of Pima Indian women.
The dataset name has been changed to 'Pima_tr2_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(Pima_tr2_df)
A data frame with 300 observations and 8 variables:
An integer representing the number of pregnancies.
An integer indicating the plasma glucose concentration (mg/dL) 2 hours after an oral glucose tolerance test.
An integer representing the diastolic blood pressure (mm Hg).
An integer indicating the triceps skin fold thickness (mm).
A numeric value indicating the body mass index (BMI).
A numeric value representing the diabetes pedigree function.
An integer indicating the age of the individual (in years).
A factor indicating whether the individual has diabetes (levels: Yes or No).
This dataset contains medical examination data for Pima Indian women, including various health metrics that may be related to diabetes.
Data collected from medical examinations of Pima Indian women.
The dataset name has been changed to 'Puromycin_df' to avoid confusion with datasets from other packages in the R ecosystem and to follow the naming convention in the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, distinguishing it both within the package and from similar datasets in other R packages. The original content of the dataset has not been modified in any way.
data(Puromycin_df)
A data frame with 23 observations and 3 variables:
Substrate concentration (numeric).
Reaction velocity (numeric).
A factor with two levels: treated and untreated, indicating whether Puromycin was present.
The dataset contains additional metadata:
The reference for this dataset: "A1.3, p. 269".
This dataset contains data from an experiment on the reaction velocity of an enzymatic reaction with and without the use of Puromycin, an antibiotic that inhibits protein synthesis.
Experimental data on the effects of Puromycin on enzyme reaction rates.
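A short sketch of reading the metadata and summarising the measurements. The column names `conc`, `rate`, and `state` and the attribute name `reference` are assumptions taken from base R's `datasets::Puromycin`, which this dataset is described as copying unaltered; the fallback assignment is only there so the snippet runs without MedDataSets installed.

```r
# Load from MedDataSets when available; otherwise use the base R copy
# (assumption: identical content, per the package description).
if (requireNamespace("MedDataSets", quietly = TRUE)) {
  data("Puromycin_df", package = "MedDataSets")
} else {
  Puromycin_df <- datasets::Puromycin
}

# The additional metadata is stored as an attribute of the data frame
attr(Puromycin_df, "reference")

# Number of observations in each group
table(Puromycin_df$state)

# Mean reaction velocity for each substrate concentration and group
mean_rate <- aggregate(rate ~ conc + state, data = Puromycin_df, FUN = mean)
mean_rate
```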
The dataset name has been changed to 'sinusitis_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(sinusitis_tbl_df)
A tibble with 166 observations and 2 variables:
A factor indicating the treatment group (e.g., antibiotic vs. placebo).
A factor indicating the participants' self-reported improvement (e.g., yes or no).
This dataset contains information from an experiment assessing the effects of antibiotics on patients with sinusitis, focusing on the group assignments and the self-reported improvement outcomes observed in the participants.
Data collected from a clinical trial investigating the efficacy of antibiotics in treating sinusitis.
The dataset name has been changed to 'sleep_deprivation_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(sleep_deprivation_tbl_df)
A tibble with 1,087 observations and 2 variables:
A factor indicating the reported sleep deprivation level (e.g., low, moderate, high).
A factor indicating the profession of the participants (e.g., truck driver, pilot, etc.).
This dataset contains information from a survey conducted on transportation workers, focusing on the relationship between sleep deprivation and their professional roles. It includes variables on the amount of sleep reported and the professions of the respondents.
Data collected from a survey targeting transportation workers to assess the impact of sleep deprivation on their performance and well-being.
The dataset name has been changed to 'smallpox_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(smallpox_tbl_df)
A tibble with 6,224 observations and 2 variables:
A factor indicating the outcome for the individual (whether the person lived or died).
A factor indicating whether the individual was inoculated with the smallpox vaccine.
This dataset contains the results of a study on the efficacy of the smallpox vaccine. It includes information on the vaccination outcomes of individuals who were inoculated, providing insight into the effectiveness of the vaccine in preventing the disease.
Data collected from studies on smallpox vaccination and its outcomes.
The dataset name has been changed to 'smoking_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(smoking_tbl_df)
A tibble with 1,691 observations and 12 variables:
A factor indicating the gender of the respondent (e.g., male, female).
An integer representing the age of the respondent.
A factor indicating the marital status of the respondent (e.g., single, married).
A factor indicating the highest qualification attained by the respondent.
A factor indicating the nationality of the respondent.
A factor indicating the ethnicity of the respondent.
A factor indicating the gross income level of the respondent.
A factor indicating the geographical region of the respondent.
A factor indicating whether the respondent is a smoker (e.g., yes, no).
An integer representing the amount smoked on weekends.
An integer representing the amount smoked on weekdays.
A factor indicating the type of smoking behavior (e.g., occasional, regular).
This dataset contains information on smoking habits in the UK, focusing on various demographic factors and smoking behaviors. It provides insights into smoking patterns among different groups of people, helping to inform public health strategies.
Data collected from UK health surveys focusing on smoking behavior.
The dataset name has been changed to 'stent30_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(stent30_tbl_df)
A tibble with 451 observations and 2 variables:
A factor indicating the treatment group (e.g., stent vs. control).
A factor indicating the outcome of the treatment (e.g., success or failure).
This dataset contains information regarding the use of stents for the treatment of stroke, focusing on the group assignments and the outcomes observed in the participants.
Data collected from a clinical trial assessing the efficacy of stents in stroke treatment.
The dataset name has been changed to 'ToothGrowth_df' to avoid confusion with datasets from other packages in the R ecosystem and to align with the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from datasets in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(ToothGrowth_df)
A data frame with 60 observations and 3 variables:
Tooth length (numeric).
Type of supplement: either "VC" (Vitamin C) or "OJ" (Orange Juice) (factor).
Dose of Vitamin C administered in milligrams per day (numeric).
This dataset explores the effect of Vitamin C on tooth growth in guinea pigs. It includes data on tooth length as a response to different doses of Vitamin C, administered through two delivery methods.
Experimental data on the effect of Vitamin C on tooth growth in guinea pigs.
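As an illustration of how the three variables fit together, the sketch below computes the mean tooth length for each supplement type and dose. The column names `len`, `supp`, and `dose` are assumptions taken from base R's `datasets::ToothGrowth`, which this dataset is described as copying unaltered; the fallback lets the snippet run without MedDataSets installed.

```r
# Load from MedDataSets when available; otherwise use the base R copy
# (assumption: identical content, per the package description).
if (requireNamespace("MedDataSets", quietly = TRUE)) {
  data("ToothGrowth_df", package = "MedDataSets")
} else {
  ToothGrowth_df <- datasets::ToothGrowth
}

# Mean tooth length for each supplement type and dose
mean_len <- aggregate(len ~ supp + dose, data = ToothGrowth_df, FUN = mean)
mean_len

# The same means laid out as a supplement-by-dose table
# (each cell of mean_len holds one value, so the sum in xtabs() is that value)
xtabs(len ~ supp + dose, data = mean_len)
```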
The dataset name has been changed to 'transplant_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(transplant_tbl_df)
A tibble with 62 observations and 1 variable:
A factor indicating the outcome of the transplant procedure (e.g., success, failure).
This dataset contains fake data representing the success rates of transplant consultants. It provides insights into the outcomes of transplant procedures performed by different consultants, useful for evaluating consultant performance and patient outcomes.
Synthetic data created for educational purposes.
The dataset name has been changed to 'VA_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(VA_df)
A data frame with 137 observations and 8 variables:
A numeric value representing the survival time (in days).
A numeric value indicating the status of the patient (1 if the patient died, 0 otherwise).
A factor indicating the treatment group (standard or test).
A numeric value representing the age of the patient (in years).
A numeric value representing the Karnofsky performance status score.
A numeric value indicating the time since diagnosis (in months).
A factor indicating the cell type of the lung cancer (with 4 possible levels).
A factor indicating prior treatment (yes/no).
This dataset contains data from the Veteran's Administration Lung Cancer Trial, which includes information on patients diagnosed with lung cancer, their treatment, and other relevant variables.
Data collected from the Veteran's Administration Lung Cancer Trial.
The dataset name has been changed to 'VADeaths_matrix' to avoid confusion with datasets from other packages in the R ecosystem and to align with the naming conventions of the 'MedDataSets' package. The suffix '_matrix' indicates that this dataset is a matrix, helping to distinguish it from other datasets within the package and from datasets in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(VADeaths_matrix)
A matrix with 5 rows and 4 columns:
Death rates for Rural Male (numeric).
Death rates for Rural Female (numeric).
Death rates for Urban Male (numeric).
Death rates for Urban Female (numeric).
Age groups: 50-54, 55-59, 60-64, 65-69, 70-74.
This dataset contains death rates per 1,000 individuals in various population groups in Virginia in 1940, classified by age group and rural/urban residency.
U.S. Census Bureau, Virginia (1940) Death Records.
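Because this dataset is a matrix rather than a data frame, values are selected by row and column names. The sketch below shows name-based indexing and one way to reshape the matrix to long format. It assumes `VADeaths_matrix` matches base R's `datasets::VADeaths`; the column names given to `deaths_long` are hypothetical, chosen only for this example.

```r
# Load from MedDataSets when available; otherwise use the base R copy
# (assumption: identical content, per the package description).
if (requireNamespace("MedDataSets", quietly = TRUE)) {
  data("VADeaths_matrix", package = "MedDataSets")
} else {
  VADeaths_matrix <- datasets::VADeaths
}

dim(VADeaths_matrix)       # rows (age groups) and columns (population groups)
dimnames(VADeaths_matrix)  # the labels used for indexing

# Select a single death rate by age group and population group
VADeaths_matrix["50-54", "Rural Male"]

# Reshape to long format: one row per age group and population group
deaths_long <- as.data.frame(as.table(VADeaths_matrix))
names(deaths_long) <- c("age_group", "population", "rate")  # illustrative names
head(deaths_long)
```

The long format is often more convenient for plotting or modelling functions that expect one observation per row.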
The dataset name has been changed to 'wtloss_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_df' indicates that this dataset is a data frame, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(wtloss_df)
A data frame with 52 observations and 2 variables:
An integer representing the number of days in the weight loss program.
A numeric value indicating the weight of the patient (in kilograms).
This dataset contains weight loss data from an obese patient, detailing the weight changes over a specified number of days during a weight loss program.
Data collected from a clinical study on weight loss in obese patients.
The dataset name has been changed to 'yawn_tbl_df' to avoid confusion with other datasets from packages in the R ecosystem and to follow the naming conventions of the 'MedDataSets' package. The suffix '_tbl_df' indicates that this dataset is a tibble, helping to distinguish it from other datasets within the package and from those in the broader R ecosystem. The original content of the dataset has not been modified in any way.
data(yawn_tbl_df)
A tibble with 50 observations and 2 variables:
A factor indicating the result of the yawning observation (e.g., yawned, did not yawn).
A factor representing the group to which the participants belong (e.g., control, experimental).
This dataset investigates the contagiousness of yawning. It includes results from an experiment that examines whether individuals yawn more when they are in the presence of someone else who is yawning, providing insights into social behaviors and contagion phenomena.
Data collected from a study on yawning contagion.