| Title: | Access Infectious and Epidemiological Data via 'disease.sh API' |
|---|---|
| Description: | Provides functions to access real-time infectious disease data from the 'disease.sh API', including COVID-19 global, US states, continent, and country statistics, vaccination coverage, influenza-like illness data from the Centers for Disease Control and Prevention (CDC), and more. Also includes curated datasets on a variety of infectious diseases such as influenza, measles, dengue, Ebola, tuberculosis, meningitis, AIDS, and others. The package supports epidemiological research and data analysis by combining API access with high-quality historical and survey datasets on infectious diseases. For more details on the 'disease.sh API', see <https://disease.sh/>. |
| Authors: | Renzo Caceres Rossi [aut, cre] |
| Maintainer: | Renzo Caceres Rossi <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.2 |
| Built: | 2026-05-15 08:49:08 UTC |
| Source: | https://github.com/lightbluetitan/infectiousr |
This dataset, active_hepatitis_df, is a data frame containing information from a clinical trial of 44 patients with chronic active hepatitis. Patients were randomized to receive either the drug prednisolone or no treatment (control group).
data(active_hepatitis_df)
A data frame with 44 observations and 3 variables:
Integer vector indicating treatment group: 1 for prednisolone, 0 for control
Integer vector representing the time to event or censoring (in days)
Integer vector indicating status: 1 for death, 0 for censored
The dataset name has been kept as 'active_hepatitis_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the collett package version 0.1.0
This dataset, aids_azt_df, is a data frame containing cross-classified counts of AIDS symptoms and AZT use by race of the patients, as reported in a 1991 New York Times article.
data(aids_azt_df)
A data frame with 4 observations and 4 variables:
Numeric vector indicating the number of patients showing AIDS symptoms
Numeric vector indicating the number of patients not showing AIDS symptoms
Factor with 2 levels indicating AZT use (yes, no)
Factor with 2 levels indicating patient race (white, black)
The dataset name has been kept as 'aids_azt_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the cond package version 1.2-4
This dataset, bcg_vaccine_df, is a data frame containing results from 13 studies examining the effectiveness of the Bacillus Calmette-Guerin (BCG) vaccine against tuberculosis.
data(bcg_vaccine_df)
A data frame with 13 observations and 9 variables:
Integer identifier for each study
Character vector indicating the lead author of each study
Integer year in which the study was published
Integer count of tuberculosis cases in the treatment group
Integer count of non-cases in the treatment group
Integer count of tuberculosis cases in the control group
Integer count of non-cases in the control group
Integer representing absolute latitude of study location
Character string describing the method of allocation
The dataset name has been kept as 'bcg_vaccine_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the metadat package version 1.4-0
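The four count columns describe one 2x2 table per study, from which a risk ratio can be derived. The following is a minimal base R sketch of that calculation. The helper `risk_ratio` and its argument names are hypothetical, and the counts are made up for illustration rather than read from the dataset.

```r
# Risk ratio and log-scale standard error from a single 2x2 table.
# Argument names are hypothetical; they are not the dataset's column names.
risk_ratio <- function(t_cases, t_noncases, c_cases, c_noncases) {
  n_t <- t_cases + t_noncases          # size of the treatment group
  n_c <- c_cases + c_noncases          # size of the control group
  rr <- (t_cases / n_t) / (c_cases / n_c)
  # Standard large-sample standard error of log(RR)
  se_log_rr <- sqrt(1 / t_cases - 1 / n_t + 1 / c_cases - 1 / n_c)
  c(rr = rr, log_rr = log(rr), se_log_rr = se_log_rr)
}

# Illustrative counts only (not taken from bcg_vaccine_df)
res <- risk_ratio(t_cases = 10, t_noncases = 90, c_cases = 20, c_noncases = 80)
print(res)
```

A risk ratio below 1 would indicate fewer cases in the vaccinated group than in the control group.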
This dataset, campy_infections_ts, is a time series object containing the number of cases of campylobacter infections in the north of the province of Quebec (Canada) in four-week intervals from January 1990 to the end of October 2000. It contains 13 observations per year and 140 observations in total.
data(campy_infections_ts)
A time series object of class ts with 140 observations at frequency 13,
running from January 1990 to the end of October 2000.
The dataset name has been kept as 'campy_infections_ts' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'ts' indicates that the dataset is a time series object. The original content has not been modified in any way.
Data taken from the tscount package version 1.4.3. Original study: Ferland, R., Latour, A. and Oraichi, D. (2006) Integer-valued GARCH process. Journal of Time Series Analysis 27(6), 923–942.
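To make the time index concrete, the sketch below builds a placeholder series with the same shape (140 observations, 13 per year, starting in the first period of 1990) and inspects where it ends. The zeros are placeholders, not the actual case counts, and the start period of `c(1990, 1)` is an assumption based on the description above.

```r
# Placeholder series with the documented shape; values are NOT real data.
x <- ts(rep(0, 140), start = c(1990, 1), frequency = 13)

# 140 = 10 full years * 13 periods + 10 periods,
# so the last observation falls in period 10 of the year 2000.
print(start(x))
print(end(x))
print(frequency(x))
```

With the package loaded, the same calls (`start()`, `end()`, `frequency()`) can be applied to `campy_infections_ts` directly.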
This dataset, china_dengue_tbl_df, is a tibble containing annual records of indigenous and imported dengue cases in mainland China from 2005 to 2020.
data(china_dengue_tbl_df)
A tibble with 16 observations and 5 variables:
Integer year of observation (2005–2020)
Numeric vector of indigenous dengue cases
Numeric vector of imported dengue cases
Numeric vector of counties with reported indigenous dengue fever
Numeric vector of counties with reported imported dengue fever
The dataset name has been kept as 'china_dengue_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Data taken from the denguedatahub package version 2.1.1
This dataset, contagious_diseases_df, is a data frame containing yearly counts
for Hepatitis A, Measles, Mumps, Pertussis, Polio, Rubella, and Smallpox for US states.
The original data is courtesy of the Tycho Project.
data(contagious_diseases_df)
A data frame with 16,065 observations and 6 variables:
Factor with 7 levels indicating the disease type
Factor with 51 levels indicating the US state
Numeric vector indicating the year of observation
Numeric vector indicating the number of weeks reported
Numeric vector indicating the number of cases reported
Numeric vector indicating the population of the state in that year
The dataset name has been kept as contagious_diseases_df to avoid confusion with other datasets
in the R ecosystem. This naming convention helps distinguish this dataset as part of the
infectiousR package and assists users in identifying its specific characteristics.
The suffix _df indicates that the dataset is a data frame. The original content has not been modified
in any way.
Data taken from the dslabs package version 0.8.0. Original data courtesy of the Tycho Project (http://www.tycho.pitt.edu/).
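Because the data record both the number of cases and the number of weeks reported, a common way to compare states and years is an annualized rate per 10,000 people. The sketch below shows one such adjustment. The helper `annual_rate_per_10k`, its argument names, and the scaling to 52 weeks are assumptions made for illustration; they are not documented by the package.

```r
# Annualized cases per 10,000 people, scaled up for partially reported years.
# Argument names are hypothetical stand-ins for the dataset's columns.
annual_rate_per_10k <- function(count, weeks_reporting, population) {
  rate <- count / population * 10000 * 52 / weeks_reporting
  # A year with zero reported weeks carries no information about the rate
  ifelse(weeks_reporting > 0, rate, NA_real_)
}

# Illustrative values only (not taken from contagious_diseases_df)
rates <- annual_rate_per_10k(
  count = c(500, 0),
  weeks_reporting = c(26, 0),
  population = c(1e6, 1e6)
)
print(rates)
```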
This dataset, covid_mortality_df, is a data frame containing several effect
estimates and their standard errors for the impact of cardiovascular
disease on the mortality of COVID-19 reported in the literature.
data(covid_mortality_df)
A data frame with 6 observations and 3 variables:
Character vector with the name or reference of each study
Numeric vector representing the estimated effect size
Numeric vector representing the standard error associated with each estimate
The dataset name has been kept as covid_mortality_df to avoid confusion with other datasets
in the R ecosystem. This naming convention helps distinguish this dataset as part of the
infectiousR package and assists users in identifying its specific characteristics.
The suffix _df indicates that the dataset is a data frame. The original content has not been modified
in any way.
Data taken from the PRP package version 0.1.1
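A table of effect estimates with standard errors is the usual input to a meta-analysis. As a minimal sketch, the code below computes a fixed-effect (inverse-variance weighted) pooled estimate in base R. The helper `pool_fixed_effect` is hypothetical, the input values are made up, and this is only one of several possible pooling approaches, not a method prescribed by the package.

```r
# Fixed-effect pooling: each study is weighted by the inverse of its variance.
pool_fixed_effect <- function(estimate, se) {
  w <- 1 / se^2
  pooled <- sum(w * estimate) / sum(w)
  pooled_se <- sqrt(1 / sum(w))
  c(estimate = pooled, se = pooled_se)
}

# Illustrative inputs only (not taken from covid_mortality_df)
pooled <- pool_fixed_effect(estimate = c(1, 3), se = c(1, 1))
print(pooled)
```

The same function could be applied to the estimate and standard-error columns of this dataset once loaded.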
This dataset, covid_new_york_df, is a data frame containing daily counts of COVID-19 cases, hospitalizations, and deaths by borough in New York City through 2020-06-30.
data(covid_new_york_df)
A data frame with 615 observations and 5 variables:
Date of observation
Character vector indicating the borough (e.g., Manhattan, Bronx, etc.)
Integer vector representing the number of reported COVID-19 cases
Integer vector representing the number of hospitalizations
Integer vector representing the number of deaths
The dataset name has been kept as 'covid_new_york_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the incidental package version 0.1
This dataset, covid_severity_df, is a data frame containing several effect
estimates and their standard errors for the impact of cardiovascular
disease on the severe case rate of COVID-19 as reported in the literature.
data(covid_severity_df)
A data frame with 6 observations and 3 variables:
Character vector with the name or reference of each study
Numeric vector representing the estimated effect size
Numeric vector representing the standard error associated with each estimate
The dataset name has been kept as covid_severity_df to avoid confusion with other datasets
in the R ecosystem. This naming convention helps distinguish this dataset as part of the
infectiousR package and assists users in identifying its specific characteristics.
The suffix _df indicates that the dataset is a data frame. The original content has not been modified
in any way.
Data taken from the PRP package version 0.1.1
This dataset, diphtheria_philly_df, is a data frame containing the weekly incidence of diphtheria in Philadelphia between 1914 and 1947.
data(diphtheria_philly_df)
A data frame with 1774 observations and 4 variables:
Integer vector representing the year of observation (1914–1947)
Integer vector representing the epidemiological week (1–52)
Integer vector representing the weekly incidence of diphtheria in Philadelphia
Numeric vector representing the continuous time index
The dataset name has been kept as 'diphtheria_philly_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the epimdr package version 0.6-5
This dataset, ebola_cases_df, is a data frame containing daily time series counts of new individuals exhibiting clinical signs of Ebola virus disease, as well as the number of daily removals (e.g., deaths or recoveries), during the 1995 Ebola epidemic in the Democratic Republic of Congo (DRC).
data(ebola_cases_df)
A data frame with 192 observations and 3 variables:
Integer indicating the number of days since the beginning of observation
Integer indicating the number of new individuals with clinical signs of Ebola
Integer indicating the number of new removals (e.g., deaths or recoveries)
The dataset name has been kept as 'ebola_cases_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the SimBIID package version 0.2.2
This dataset, ebola_sleone_df, is a data frame containing the cumulative number of Ebola virus disease cases in Sierra Leone, Africa, recorded from May 1, 2014 to December 16, 2015.
data(ebola_sleone_df)
A data frame with 110 observations and 2 variables:
Integer indicating the number of days since May 1, 2014
Integer representing the cumulative number of Ebola cases reported
The dataset name has been kept as 'ebola_sleone_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the MMAC package version 0.1.2
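Since the series is cumulative and has 110 observations over a longer span of days, new cases between reports can be recovered by differencing. The following base R sketch uses made-up values and hypothetical variable names to show the idea.

```r
# Illustrative values only (not taken from ebola_sleone_df)
day <- c(0, 5, 12, 20)            # days since the start of observation
cum_cases <- c(16, 50, 81, 158)   # cumulative case counts

new_cases <- diff(cum_cases)      # cases added between consecutive reports
interval_days <- diff(day)        # length of each reporting interval
avg_daily <- new_cases / interval_days

print(data.frame(day_end = day[-1], new_cases, interval_days, avg_daily))
```

Dividing by the interval length matters when reports are unevenly spaced, because raw differences then cover different numbers of days.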
This dataset, ebola_survey_tbl_df, is a tibble containing responses from a poll conducted in New York City between October 26th and 28th, 2014. The poll was conducted shortly after a doctor who had treated Ebola patients in Guinea was diagnosed with Ebola in New York City. Participants were asked whether they favored a "mandatory 21-day quarantine for anyone who has come in contact with an Ebola patient". The survey included responses from 1,042 adults residing in New York.
data(ebola_survey_tbl_df)
A tibble with 1,042 observations and 1 variable:
Factor with two levels indicating whether the respondent supports a mandatory 21-day quarantine for individuals who have come in contact with an Ebola patient
The dataset name has been kept as 'ebola_survey_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Data taken from the openintro package version 2.5.0
This dataset, ecoli_infections_df, is a data frame containing the weekly number of reported disease cases caused by Escherichia coli in the state of North Rhine-Westphalia (Germany) from January 2001 to May 2013. The data excludes cases of EHEC (enterohemorrhagic E. coli) and HUS (hemolytic uremic syndrome).
data(ecoli_infections_df)
A data frame with 646 observations and 3 variables:
Numeric variable indicating the calendar year of observation
Numeric variable indicating the calendar week (1 to 52 or 53)
Numeric variable representing the number of reported E. coli cases
The dataset name has been kept as 'ecoli_infections_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the tscount package version 1.4.3
This dataset, ehec_infections_df, is a data frame containing the weekly number of reported EHEC/HUS infections in the state of North Rhine-Westphalia (Germany) from January 2001 to May 2013.
data(ehec_infections_df)
A data frame with 646 observations and 3 variables:
Numeric variable indicating the calendar year of observation
Numeric variable indicating the calendar week (1 to 52 or 53)
Numeric variable representing the number of reported EHEC/HUS cases
The dataset name has been kept as 'ehec_infections_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the tscount package version 1.4.3
This dataset, flu_enrich_df, is a data frame containing gene-set enrichment information for genes that have been identified as having an effect on influenza-virus replication.
data(flu_enrich_df)
A data frame with 5719 observations and 3 variables:
Numeric vector representing gene identifiers with an effect on influenza-virus replication
Integer vector representing the size of each gene set
Factor vector representing Gene Ontology terms associated with each gene set
The dataset name has been kept as 'flu_enrich_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the rvalues package version 0.7.1
This dataset, fungal_infections_df, is a data frame containing results from a clinical trial on the success of a particular treatment for fungal infections across five research units. Interest in the study focuses on the treatment effect.
data(fungal_infections_df)
A data frame with 10 observations and 4 variables:
Numeric vector indicating the number of treatment successes
Numeric vector indicating the number of treatment failures
Factor with 2 levels indicating treatment group (control, treated)
Factor with 5 levels indicating the research center where the trial was conducted
The dataset name has been kept as 'fungal_infections_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the cond package version 1.2-4
Retrieves real-time COVID-19 totals for all continents from the 'disease.sh' API.
get_covid_stats_by_continent(
  yesterday = FALSE,
  twoDaysAgo = FALSE,
  sort = NULL,
  allowNull = FALSE
)
yesterday |
Logical. If |
twoDaysAgo |
Logical. If |
sort |
Character. Field to sort results by. Options include: |
allowNull |
Logical. If |
This function retrieves COVID-19 summary data for each continent. You may specify whether to get data from today, yesterday, or two days ago. Requires an active internet connection.
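The arguments above presumably translate into query parameters on the request URL. The sketch below shows how such a URL might be assembled without sending any request. The helper `build_continent_url`, the endpoint path, and the parameter mapping are assumptions based on the public disease.sh documentation, not taken from the package source.

```r
# Hypothetical helper: builds a request URL, performs no network access.
build_continent_url <- function(yesterday = FALSE, twoDaysAgo = FALSE,
                                sort = NULL, allowNull = FALSE) {
  base <- "https://disease.sh/v3/covid-19/continents"  # assumed endpoint path
  # Logical flags are sent as lowercase "true"/"false" strings
  params <- c(
    yesterday = tolower(as.character(yesterday)),
    twoDaysAgo = tolower(as.character(twoDaysAgo)),
    allowNull = tolower(as.character(allowNull))
  )
  # The sort field is only added when the caller supplies one
  if (!is.null(sort)) params <- c(params, sort = sort)
  paste0(base, "?", paste(names(params), params, sep = "=", collapse = "&"))
}

url <- build_continent_url(yesterday = TRUE, sort = "cases")
print(url)
```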
A data frame containing:
continent: Continent name.
updated: Last updated timestamp (as POSIXct in UTC).
cases: Total confirmed cases.
todayCases: New confirmed cases today.
deaths: Total deaths.
todayDeaths: New deaths today.
population: Continent population estimate.
Returns NULL if the API is unavailable or an error occurs.
Requires internet access. Function fails gracefully if API is unavailable.
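Returning NULL instead of raising an error is a pattern callers need to handle explicitly. The sketch below illustrates the general idea with `tryCatch()`. The wrapper `fetch_or_null` is hypothetical and is not the package's actual implementation; the two anonymous functions stand in for a successful and a failed request.

```r
# Hypothetical wrapper: runs a fetch function and returns NULL on any error.
fetch_or_null <- function(fetch) {
  tryCatch(
    fetch(),
    error = function(e) {
      message("Request failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for a successful call (placeholder values, not real statistics)
ok <- fetch_or_null(function() data.frame(continent = "Example", cases = 0))

# Stand-in for an unavailable API
failed <- fetch_or_null(function() stop("API unavailable"))

if (is.null(failed)) message("No data returned; skipping downstream analysis.")
```

Checking `is.null()` before using the result, as the package's own examples do, keeps scripts from failing when the API is down.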
API Docs: https://disease.sh/docs/#/COVID-19
## Not run:
# Get current COVID-19 stats for all continents
stats <- get_covid_stats_by_continent()
if (!is.null(stats)) {
  print(stats)
}

# Get yesterday's data sorted by number of cases
stats_yesterday <- get_covid_stats_by_continent(yesterday = TRUE, sort = "cases")
## End(Not run)
Retrieves real-time COVID-19 totals for all countries from the 'disease.sh' API.
get_covid_stats_by_country(
  yesterday = FALSE,
  twoDaysAgo = FALSE,
  sort = NULL,
  allowNull = FALSE
)
yesterday |
Logical. If |
twoDaysAgo |
Logical. If |
sort |
Character. Field to sort results by. Options include: |
allowNull |
Logical. If |
This function fetches COVID-19 summary statistics for each country. Useful for global surveillance or international comparisons. Requires an active internet connection.
A data frame containing:
country: Country name.
updated: Last updated timestamp (as POSIXct in UTC).
cases: Total confirmed cases.
todayCases: New confirmed cases today.
deaths: Total deaths.
todayDeaths: New deaths today.
population: Population estimate for each country.
Returns NULL if the API is unavailable or an error occurs.
Requires internet access. Function fails gracefully if API is unavailable.
API Docs: https://disease.sh/docs/#/COVID-19
## Not run:
# Get real-time COVID-19 data for all countries
all_countries <- get_covid_stats_by_country()
if (!is.null(all_countries)) {
  head(all_countries)
}

# Get yesterday's data sorted by number of deaths
yesterday_deaths <- get_covid_stats_by_country(yesterday = TRUE, sort = "deaths")
## End(Not run)
Retrieves COVID-19 totals for a given country using the 'disease.sh' API.
get_covid_stats_by_country_name(
  country,
  yesterday = FALSE,
  twoDaysAgo = FALSE,
  strict = TRUE,
  allowNull = FALSE
)
country |
Character. A country name, ISO2, ISO3 code, or country ID. |
yesterday |
Logical. If |
twoDaysAgo |
Logical. If |
strict |
Logical. If |
allowNull |
Logical. If |
This function accesses COVID-19 data for a specific country based on its name or ISO code. Requires an active internet connection.
A data frame with the following columns:
country: Country name.
updated: Timestamp of last update (POSIXct in UTC).
cases: Total confirmed cases.
todayCases: New confirmed cases today.
deaths: Total deaths.
recovered: Total recoveries.
population: Estimated population.
Returns NULL if the API is unavailable, the country is not found, or an error occurs.
Requires internet connection. Function fails gracefully if API is unavailable.
API Docs: https://disease.sh/docs/#/COVID-19
## Not run:
# Get data for Brazil
brazil_data <- get_covid_stats_by_country_name("Brazil")
if (!is.null(brazil_data)) {
  print(brazil_data)
}

# Get data for the USA using ISO2 code
usa_data <- get_covid_stats_by_country_name("US", yesterday = TRUE)
## End(Not run)
Retrieves real-time COVID-19 totals for one or more U.S. states from the 'disease.sh' API.
get_covid_stats_for_state(states, yesterday = FALSE, allowNull = FALSE)
states |
A character string with the name of a U.S. state or a comma-separated list of state names. Names must be spelled correctly. |
yesterday |
Logical. If |
allowNull |
Logical. If |
This function sends a GET request to the 'disease.sh' API for COVID-19 statistics in one or more U.S. states. If multiple states are passed, they must be comma-separated and correctly spelled. The 'updated' field is returned in milliseconds and is converted to a POSIXct datetime. Requires an active internet connection.
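When the state names start out as a character vector, they need to be collapsed into the single comma-separated string that `states` expects. The sketch below shows that step in base R, along with how such a string would look once URL-encoded. Whether the package performs the encoding internally is an assumption; the variable names are illustrative.

```r
# State names held as a vector
states <- c("New York", "Texas")

# Collapse into one comma-separated string, with no space after the comma
states_arg <- paste(states, collapse = ",")
print(states_arg)

# For illustration: spaces become %20 in a URL, commas are left as-is
encoded <- utils::URLencode(states_arg)
print(encoded)
```

The resulting `states_arg` has the same form as the `"New York,Texas"` argument used in the examples for this function.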
A data frame containing the following columns:
state: State name.
updated: Last updated timestamp (converted to human-readable datetime in UTC).
cases: Total confirmed cases.
todayCases: New confirmed cases today.
deaths: Total deaths.
todayDeaths: New deaths today.
population: State population estimate.
Returns NULL if the API is unavailable, the state(s) are not found, or an error occurs.
Requires an internet connection. Function fails gracefully if API is unavailable.
API Docs: https://disease.sh/docs/#/COVID-19
## Not run:
# Retrieve COVID-19 data for California
ca <- get_covid_stats_for_state("California")
if (!is.null(ca)) {
  print(ca)
}

# Retrieve yesterday's data for New York and Texas
ny_tx <- get_covid_stats_for_state("New York,Texas", yesterday = TRUE)
## End(Not run)
Retrieves real-time global statistics on COVID-19 from the 'disease.sh' API.
get_global_covid_stats()
This function sends a GET request to the 'disease.sh' API and parses the returned JSON into a structured and user-friendly data frame. The timestamp is converted to a readable date-time format (in UTC). Requires an active internet connection.
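The timestamp conversion can be reproduced in base R. The sketch below assumes the raw value is a Unix epoch in milliseconds, which is how the state-level function describes its 'updated' field. The helper `ms_to_utc` is hypothetical and the input value is an arbitrary example rather than an API response.

```r
# Convert a millisecond Unix epoch to a POSIXct date-time in UTC.
ms_to_utc <- function(ms) {
  as.POSIXct(ms / 1000, origin = "1970-01-01", tz = "UTC")
}

stamp <- ms_to_utc(1600000000000)  # arbitrary example value
formatted <- format(stamp, "%Y-%m-%d %H:%M:%S")
print(formatted)
```

Dividing by 1000 is the important step, since `as.POSIXct()` interprets numeric input as seconds.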
A data frame with the following columns:
updated: Last updated time (as a human-readable date-time).
cases: Total confirmed cases worldwide.
newCases: Number of new confirmed cases today.
deaths: Total confirmed deaths worldwide.
recovered: Total number of recovered patients.
newRecov: Number of recovered patients today.
active: Current active cases.
critical: Current number of critical cases.
tests: Total number of tests performed.
pop: Estimated global population.
countries: Number of countries affected.
Returns NULL if the API is unavailable or an error occurs.
An internet connection is required to use this function. Function fails gracefully if API is unavailable.
API Docs: https://disease.sh/docs/#/COVID-19
## Not run:
global_stats <- get_global_covid_stats()
if (!is.null(global_stats)) {
  print(global_stats)
}
## End(Not run)
Retrieves influenza-like illness (ILI) data for the 2019 and 2020 influenza outbreaks from the US CDC.
get_influenza_cdc_ili()
This endpoint provides historical data for flu-like symptoms reported in the United States, sourced from the CDC ILINet. Requires an active internet connection.
A list containing:
updated: Last update timestamp (POSIXct).
source: Source of the data.
data: A data frame with the following columns:
week: Week of reporting.
age 5-24, age 25-49, age 50-64, age 64+: ILI counts per age group.
totalILI: Total ILI cases.
totalPatients: Total patients.
Returns NULL if the API is unavailable or an error occurs.
Requires internet connection. Function fails gracefully if API is unavailable.
API Docs: https://disease.sh/docs/#/Influenza/get_v3_influenza_cdc_ILINet
## Not run:
ili_data <- get_influenza_cdc_ili()
if (!is.null(ili_data)) {
  print(ili_data$updated)
  head(ili_data$data)
}
## End(Not run)
Retrieves real-time COVID-19 totals from the 'disease.sh' API for all 50 U.S. states, as well as U.S. territories (e.g., Puerto Rico, Guam), special jurisdictions (e.g., Veteran Affairs, U.S. Military), and others (e.g., cruise ships, repatriated individuals).
get_us_states_covid_stats()
This function sends a GET request to the 'disease.sh' API endpoint for US state-level COVID-19 statistics and parses the response into a structured data frame. The timestamp is converted to a readable date-time format (in UTC). Requires an active internet connection.
A data frame with the following columns:
state: Name of the U.S. state.
cases: Total confirmed cases in the state.
todayCases: New confirmed cases today.
deaths: Total deaths in the state.
todayDeaths: New deaths today.
active: Current active cases.
population: Estimated state population.
Returns NULL if the API is unavailable or an error occurs.
An internet connection is required to use this function. Function fails gracefully if API is unavailable.
API Docs: https://disease.sh/docs/#/COVID-19
## Not run:
us_states_stats <- get_us_states_covid_stats()
if (!is.null(us_states_stats)) {
  head(us_states_stats)
}
## End(Not run)
This dataset, gonorrhea_ma_df, is a data frame containing weekly cases of gonorrhea in Massachusetts between 2006 and 2015.
data(gonorrhea_ma_df)
A data frame with 422 observations and 4 variables:
Integer vector representing the number of weekly gonorrhea cases
Numeric vector representing the year of observation (2006–2015)
Numeric vector representing the epidemiological week (1–52)
Numeric vector representing the continuous time index
The dataset name has been kept as 'gonorrhea_ma_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the epimdr package version 0.6-5
This dataset, hepatitisA_df, is a data frame containing information from a cross-sectional survey conducted in 1964 on the prevalence of hepatitis A in individuals from Bulgaria. The surveyed population includes individuals aged between 1 and 86 years.
data(hepatitisA_df)
A data frame with 83 observations and 3 variables:
Integer vector indicating the age of the individuals
Integer vector representing the frequency of individuals tested
Integer vector representing the frequency of individuals with antibodies to hepatitis A
The dataset name has been kept as 'hepatitisA_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the curstatCI package version 0.1.1
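With counts of individuals tested and individuals with antibodies at each age, age-specific seroprevalence is a simple proportion. The sketch below computes it with exact binomial confidence intervals using `binom.test()` from the stats package. The variable names and values are made up for illustration and do not come from the dataset.

```r
# Illustrative values only (not taken from hepatitisA_df)
age <- c(1, 2, 3)
tested <- c(10, 20, 40)
seropositive <- c(1, 5, 20)

# Proportion of tested individuals with antibodies at each age
prevalence <- seropositive / tested

# Exact (Clopper-Pearson) 95% interval per age; result has one column per age
ci <- mapply(function(x, n) binom.test(x, n)$conf.int, seropositive, tested)

print(data.frame(age, prevalence, lower = ci[1, ], upper = ci[2, ]))
```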
This dataset, india_dengue_tbl_df, is a tibble containing state and union territory-wise annual dengue/DHF (Dengue Hemorrhagic Fever) cases and deaths in India since 2017.
data(india_dengue_tbl_df)
A tibble with 432 observations and 5 variables:
Character vector indicating the State or Union Territory
Character vector indicating whether the entry refers to 'cases' or 'deaths'
Character vector indicating the year of observation
Character vector providing supplemental information
Numeric vector indicating the number of cases or deaths
The dataset name has been kept as 'india_dengue_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble (enhanced data frame). The original content has not been modified in any way.
Data taken from the denguedatahub package version 2.1.1
This package provides functions to access real-time infectious disease data from the 'disease.sh API', including COVID-19 global, US states, continent, and country statistics, vaccination coverage, and influenza-like illness data from the Centers for Disease Control and Prevention (CDC). It also includes curated datasets on a variety of infectious diseases such as influenza, measles, dengue, Ebola, tuberculosis, meningitis, AIDS, and others.
infectiousR: Access Infectious and Epidemiological Data via 'disease.sh API'
Maintainer: Renzo Caceres Rossi [email protected]
This dataset, influenza_ice_df, is a data frame containing monthly incidence data of influenza-like illness (ILI) in Iceland between 1980 and 2009.
data(influenza_ice_df)
A data frame with 360 observations and 3 variables:
Integer vector representing the month of observation (1–12)
Integer vector representing the year of observation (1980–2009)
Integer vector representing the monthly incidence of influenza-like illness
The dataset name has been kept as 'influenza_ice_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the epimdr package version 0.6-5
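The 360 rows correspond to 30 years of 12 months, so the incidence column can be converted to a monthly `ts` object for time series work. This is a minimal sketch, assuming the columns appear in the order listed above (month, year, incidence). The helper `monthly_to_ts` and the synthetic `demo` data are hypothetical and not part of infectiousR.

```r
# Hypothetical helper (not part of infectiousR): build a monthly ts from a
# data frame whose columns are, in order, month, year, and a monthly value.
monthly_to_ts <- function(df) {
  # Sort chronologically so the result does not depend on row order
  df <- df[order(df[[2]], df[[1]]), ]
  stats::ts(df[[3]], start = c(df[[2]][1], df[[1]][1]), frequency = 12)
}

# Synthetic stand-in with the same layout (2 years x 12 months)
demo <- data.frame(m = rep(1:12, 2), y = rep(c(1980, 1981), each = 12), v = 1:24)
demo_ts <- monthly_to_ts(demo)

# With the package installed, the same call can be tried on the real data
if (requireNamespace("infectiousR", quietly = TRUE)) {
  data("influenza_ice_df", package = "infectiousR")
  ice_ts <- monthly_to_ts(influenza_ice_df)
}
```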
This dataset, influenza_infections_df, is a data frame containing the weekly number of reported influenza cases in the state of North Rhine-Westphalia (Germany) from January 2001 to May 2013.
data(influenza_infections_df)
A data frame with 646 observations and 3 variables:
Numeric variable indicating the calendar year of observation
Numeric variable indicating the calendar week (1 to 52 or 53)
Numeric variable representing the number of reported influenza cases
The dataset name has been kept as 'influenza_infections_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the tscount package version 1.4.3
This dataset, influenza_pneumonia_ts, is a time series containing monthly pneumonia and influenza deaths
per 10,000 people in the United States over a period of 11 years, from 1968 to 1978.
data(influenza_pneumonia_ts)
A time series object with 132 monthly observations:
Monthly pneumonia and influenza deaths per 10,000 people in the United States from 1968 to 1978.
The dataset name has been kept as influenza_pneumonia_ts to avoid confusion with other datasets
in the R ecosystem. This naming convention helps distinguish this dataset as part of the
infectiousR package and assists users in identifying its specific characteristics.
The suffix _ts indicates that the dataset is a time series object. The original content has not been modified
in any way.
Data taken from the astsa package version 2.2.
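Since the object is a monthly `ts`, the standard `stats` tools for subsetting and aggregating time series apply directly. The sketch below uses a synthetic series with the documented shape (132 monthly values starting in January 1968). The values themselves are placeholders, not the real death rates.

```r
# Synthetic monthly series with the documented shape: 132 values, 1968 to 1978
flu_like <- stats::ts(seq_len(132), start = c(1968, 1), frequency = 12)

# Extract a single calendar year
y1970 <- stats::window(flu_like, start = c(1970, 1), end = c(1970, 12))

# Collapse the monthly series to one mean value per year
annual_mean <- stats::aggregate(flu_like, nfrequency = 1, FUN = mean)

# With the package installed, the same calls can be tried on the real series
if (requireNamespace("infectiousR", quietly = TRUE)) {
  data("influenza_pneumonia_ts", package = "infectiousR")
  real_annual_mean <- stats::aggregate(influenza_pneumonia_ts,
                                       nfrequency = 1, FUN = mean)
}
```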
This dataset, influenza_vax_survey_df, is a data frame containing aggregated responses from three RAND American Life Panel (ALP) surveys regarding individuals' probability of vaccinating for influenza. The responses were discretized to "Never" (0%), "Always" (100%), or "Sometimes" (any other value). After merging, missing responses were coded as "Missing", and respondents were grouped and counted by all three coded responses.
data(influenza_vax_survey_df)
A data frame with 117 observations and 6 variables:
Factor indicating which of the three ALP surveys the response came from
Integer indicating frequency count of grouped respondents
Integer identifier for each subject
Factor with 4 levels: "Never", "Sometimes", "Always", and "Missing"
Date indicating the start of the survey
Date indicating the end of the survey
The dataset name has been kept as 'influenza_vax_survey_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the ggalluvial package version 0.12.5
This dataset, korea_dengue_tbl_df, is a tibble containing information on imported dengue cases in Korea from the years 2011 to 2015. The data were collected by the Korea Centers for Disease Control and Prevention (KCDC).
data(korea_dengue_tbl_df)
A tibble with 33 observations and 7 variables:
Character vector indicating the country of origin of the dengue cases
Character vector indicating the region within the country
Character vector indicating the number of imported cases in 2011
Character vector indicating the number of imported cases in 2012
Character vector indicating the number of imported cases in 2013
Character vector indicating the number of imported cases in 2014
Character vector indicating the number of imported cases in 2015
The dataset name has been kept as 'korea_dengue_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Data taken from the denguedatahub package version 2.1.1
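The yearly counts are stored as character vectors, so they need converting before any arithmetic. A minimal sketch follows, assuming the year columns are the third through seventh, as listed above. The helper `chr_to_num` and the synthetic `demo` data are hypothetical and not part of infectiousR.

```r
# Hypothetical helper: convert selected character columns to numeric.
# Entries that are not plain numbers become NA; the coercion warning is
# suppressed on purpose, so inspect the NAs afterwards.
chr_to_num <- function(df, cols) {
  df[cols] <- lapply(df[cols], function(x) suppressWarnings(as.numeric(x)))
  df
}

# Synthetic stand-in: two identifier columns followed by yearly counts
demo <- data.frame(
  country = c("A", "B"),
  region  = c("r1", "r2"),
  y2011   = c("3", "-"),
  y2012   = c("10", "4"),
  stringsAsFactors = FALSE
)
demo_num <- chr_to_num(demo, 3:4)
demo_totals <- colSums(demo_num[3:4], na.rm = TRUE)

# With the package installed, the same helper can be tried on columns 3 to 7
if (requireNamespace("infectiousR", quietly = TRUE)) {
  data("korea_dengue_tbl_df", package = "infectiousR")
  korea_num <- chr_to_num(korea_dengue_tbl_df, 3:7)
}
```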
This dataset, malaria_mice_df, is a data frame containing daily data on laboratory mice infected with various strains of *Plasmodium chabaudi*.
data(malaria_mice_df)
A data frame with 1300 observations and 11 variables:
Integer vector indicating the parasite line
Integer vector representing the day of observation
Integer vector identifying the box where the mouse was housed
Integer vector identifying the individual mouse
Factor indicating the treatment group (6 levels)
Integer vector used to identify individual measurements
Numeric vector indicating the weight of the mouse
Integer vector indicating glucose levels
Numeric vector representing red blood cell counts
Integer vector identifying sample number
Numeric vector indicating parasitemia levels
The dataset name has been kept as 'malaria_mice_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the epimdr package version 0.6-5
This dataset, measles_infections_df, is a data frame containing the weekly number of reported measles infections in the state of North Rhine-Westphalia (Germany) from January 2001 to May 2013.
data(measles_infections_df)
A data frame with 646 observations and 3 variables:
Numeric variable indicating the calendar year of observation
Numeric variable indicating the calendar week (1 to 52 or 53)
Numeric variable representing the number of reported measles cases
The dataset name has been kept as 'measles_infections_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the tscount package version 1.4.3
This dataset, measles_survey_df, is a data frame containing the results of a survey
conducted by Roberts et al. (1995) on parents whose children had not been immunized
against measles during a recent campaign targeting all children in the first five years
of secondary school.
data(measles_survey_df)
A data frame with 307 observations and 11 variables:
Factor with 10 levels indicating the school
Factor with 2 levels indicating school form
Factor with 2 levels indicating if the form was returned
Factor with 2 levels indicating if consent was given
Factor with 2 levels indicating if the child had measles
Factor with 2 levels indicating previous immunization
Factor with 2 levels indicating concerns about side effects
Factor with 2 levels indicating whether GP advised
Factor with 2 levels indicating general refusal to vaccinate
Factor with 2 levels indicating the child was not seriously ill
Factor with 2 levels indicating GP advice against immunization
The dataset name has been kept as measles_survey_df to avoid confusion with other datasets
in the R ecosystem. This naming convention helps distinguish this dataset as part of the
infectiousR package and assists users in identifying its specific characteristics.
The suffix _df indicates that the dataset is a data frame. The original content has not been modified
in any way.
Data taken from the SDaA package version 0.1-5
This dataset, meningitis_df, is a data frame containing data from a brief outbreak of meningococcal disease at the University of Illinois, Urbana-Champaign campus during the years 1991 and 1992.
data(meningitis_df)
A data frame with 60 observations and 6 variables:
Integer indicating the matched set identifier
Integer indicator variable for case (1) or control (0)
Numeric value representing the reference time (e.g., time of exposure)
Integer indicating the number of ill roommates
Integer indicating the number of roommates who slept in the room
Integer indicator for whether the subject smokes (1 = yes, 0 = no)
The dataset name has been kept as 'meningitis_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the glmfitmiss package version 2.1.0
This dataset, rubella_austria_df, is a data frame containing prevalence data of rubella in 230 Austrian males older than three months, for whom the exact date of birth was known. Each individual was tested for immunization against rubella at the Institute of Virology, Vienna, during the period 1–25 March 1988.
data(rubella_austria_df)
A data frame with 225 observations and 3 variables:
Numeric vector representing age or time (in months or years as recorded)
Integer vector representing frequency count 1
Integer vector representing frequency count 2
The dataset name has been kept as 'rubella_austria_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the curstatCI package version 0.1.1
This dataset, rubella_peru_df, is a data frame containing rubella incidence data
by age as studied by Metcalf et al. (2011) in Peru.
data(rubella_peru_df)
A data frame with 95 observations and 4 variables:
Numeric vector indicating the age of individuals
Integer vector indicating the number of rubella cases per age group
Integer vector indicating the cumulative number of cases by age
Integer vector representing the sample size for each age group
The dataset name has been kept as rubella_peru_df to avoid confusion with other datasets
in the R ecosystem. This naming convention helps distinguish this dataset as part of the
infectiousR package and assists users in identifying its specific characteristics.
The suffix _df indicates that the dataset is a data frame. The original content has not been modified
in any way.
Data taken from the epimdr package version 0.6-5
This dataset, sars_canada_df, is a data frame containing information on the daily incidence of SARS (Severe Acute Respiratory Syndrome) cases in Canada during the 2003 outbreak. The data include new cases attributed to travel, household transmission, healthcare settings, and other sources.
data(sars_canada_df)
A data frame with 110 observations and 5 variables:
Date object representing the reporting date
Integer vector indicating new SARS cases linked to travel
Integer vector indicating new SARS cases from household transmission
Integer vector indicating new SARS cases from healthcare settings
Integer vector indicating new SARS cases from other or unknown sources
The dataset name has been kept as 'sars_canada_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the outbreaks package version 1.9.0
This dataset, smallpox_nigeria_df, is a data frame containing data on 32 cases of smallpox that occurred in Abakaliki, Nigeria, in 1967. These cases were first described by Thompson and Foege (1968) and occurred predominantly in a religious group that refused medical interventions.
data(smallpox_nigeria_df)
A data frame with 32 observations and 8 variables:
Integer identifier for each smallpox case
Date of symptom onset
Age of the individual (integer)
Factor with two levels indicating gender
Factor with two levels indicating if the individual was vaccinated
Factor with two levels indicating presence of vaccination scar
Factor with two levels; additional epidemiological classification
Factor with nine levels indicating compound of residence
The dataset name has been kept as 'smallpox_nigeria_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the outbreaks package version 1.9.0
This dataset, spanish_flu_df, is a data frame containing daily mortality data from the 1918 flu pandemic covering the period from 1918-09-01 through 1918-12-31 in Indiana, Kansas, and Philadelphia.
data(spanish_flu_df)
A data frame with 122 observations and 4 variables:
Date of recorded mortality
Integer vector representing daily flu-related deaths in Indiana
Integer vector representing daily flu-related deaths in Kansas
Integer vector representing daily flu-related deaths in Philadelphia
The dataset name has been kept as 'spanish_flu_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the incidental package version 0.1
This dataset, streptomycin_tbl_df, is a tibble containing the results of a randomized, placebo-controlled, prospective 2-arm trial evaluating the use of streptomycin (2 grams daily) versus placebo in the treatment of tuberculosis among 107 young patients. The study was conducted by the Streptomycin in Tuberculosis Trials Committee and published in the British Medical Journal in 1948.
data(streptomycin_tbl_df)
A tibble with 107 observations and 13 variables:
Character identifier for each patient
Factor indicating treatment arm: streptomycin (A2) or placebo (A1)
Numeric dose of streptomycin in grams
Numeric dose of para-aminosalicylic acid (PAS) in grams
Factor with two levels indicating patient gender
Factor indicating the baseline clinical condition of the patient
Factor indicating baseline temperature category
Factor indicating baseline erythrocyte sedimentation rate (ESR) category
Factor indicating the presence or absence of lung cavitation at baseline
Factor indicating the level of resistance to streptomycin
Factor describing radiological outcomes at 6 months
Numeric radiologic score at 6 months
Logical indicator of clinical improvement
The dataset name has been kept as 'streptomycin_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble (a modern form of data frame). The original content has not been modified in any way.
Data taken from the medicaldata package version 0.2.0
This dataset, us_covid_cases_df, is a data frame containing the number of
laboratory-confirmed COVID-19 cases in the United States, as reported by the
Centers for Disease Control and Prevention (CDC), between January 1, 2020 and
May 11, 2023, the end of the public health emergency declaration.
data(us_covid_cases_df)
A data frame with 1227 observations and 2 variables:
Date of report (class Date)
Integer vector indicating the number of confirmed cases reported on each date
The dataset name has been kept as us_covid_cases_df to avoid confusion with other datasets
in the R ecosystem. This naming convention helps distinguish this dataset as part of the
infectiousR package and assists users in identifying its specific characteristics.
The suffix _df indicates that the dataset is a data frame. The original content has not been modified
in any way.
Data taken from the cpr package version 0.4.0
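The documented row count matches the number of calendar days in the stated period, which suggests (but does not prove) one row per day. The arithmetic can be checked with base R date sequences. The comparison against the real data frame is a sketch that only runs when the package is installed.

```r
# Every calendar day from January 1, 2020 to May 11, 2023, inclusive
expected_dates <- seq(as.Date("2020-01-01"), as.Date("2023-05-11"), by = "day")
n_expected <- length(expected_dates)  # 366 + 365 + 365 + 131 = 1227

# With the package installed, compare against the actual number of rows
if (requireNamespace("infectiousR", quietly = TRUE)) {
  data("us_covid_cases_df", package = "infectiousR")
  rows_match_days <- nrow(us_covid_cases_df) == n_expected
}
```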
This function lists all datasets available in the 'infectiousR' package. If the 'infectiousR' package is not loaded, it stops and shows an error message. If no datasets are available, it returns a message and an empty vector.
view_datasets_infectiousR()
A character vector with the names of the available datasets. If no datasets are found, it returns an empty character vector.
if (requireNamespace("infectiousR", quietly = TRUE)) {
  library(infectiousR)
  view_datasets_infectiousR()
}
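A dataset named in the returned vector can also be loaded programmatically rather than by typing its name. The sketch below uses `utils::data()` with its `list`, `package`, and `envir` arguments. The helper `load_dataset` is hypothetical and not part of infectiousR, and it assumes the returned names match the dataset names exactly.

```r
# Hypothetical helper: load one dataset from a package by name and return it,
# without leaving a copy in the global environment.
load_dataset <- function(name, pkg) {
  env <- new.env()
  utils::data(list = name, package = pkg, envir = env)
  get(name, envir = env)
}

# Works for any installed package that ships datasets, e.g. base R's 'datasets'
iris_copy <- load_dataset("iris", "datasets")

# With infectiousR installed, load a dataset only if it is listed as available
if (requireNamespace("infectiousR", quietly = TRUE)) {
  library(infectiousR)
  available <- view_datasets_infectiousR()
  if ("zika_girardot_df" %in% available) {
    zika <- load_dataset("zika_girardot_df", "infectiousR")
  }
}
```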
This dataset, zika_girardot_df, is a data frame containing the daily incidence of Zika virus disease in Girardot, Colombia, during 2015.
data(zika_girardot_df)
A data frame with 93 observations and 2 variables:
Date object representing the date of reported Zika cases
Integer vector indicating the number of daily reported Zika cases
The dataset name has been kept as 'zika_girardot_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the outbreaks package version 1.9.0
This dataset, zika_sanandres_df, is a data frame containing the daily incidence of Zika virus disease in San Andres, Colombia, during 2015.
data(zika_sanandres_df)
A data frame with 101 observations and 2 variables:
Date object representing the date of reported Zika cases
Integer vector indicating the number of daily reported Zika cases
The dataset name has been kept as 'zika_sanandres_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the infectiousR package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the outbreaks package version 1.9.0