| Title: | Access Indian Data via Public APIs and Curated Datasets |
|---|---|
| Description: | Provides functions to access data from public RESTful APIs including 'World Bank API', and 'REST Countries API', retrieving real-time or historical data related to India, such as economic indicators, and international demographic and geopolitical indicators. Additionally, the package includes one of the largest curated collections of open datasets focused on India, covering topics such as population, economy, weather, politics, health, biodiversity, sports, agriculture, cybercrime, infrastructure, and more. The package supports reproducible research and teaching by integrating reliable international APIs and structured datasets from public, academic, and government sources. For more information on the APIs, see: 'World Bank API' <https://datahelpdesk.worldbank.org/knowledgebase/articles/889392>, 'REST Countries API' <https://restcountries.com/>. |
| Authors: | Renzo Caceres Rossi [aut, cre] (ORCID: <https://orcid.org/0009-0005-0744-854X>) |
| Maintainer: | Renzo Caceres Rossi <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-13 09:45:03 UTC |
| Source: | https://github.com/lightbluetitan/indiapis |
This dataset, birds_watching_tbl_df, is a tibble containing detailed information on bird species observed in India, including species names, scientific names, the date of last observation, and total recorded sightings. The dataset preserves the original structure from its source on Kaggle.
data(birds_watching_tbl_df)
A tibble with 490 observations and 4 variables:
Common name of the bird species (character)
Scientific name of the bird species (character)
Date of last recorded observation (character)
Total number of recorded sightings (numeric)
The dataset name has been kept as 'birds_watching_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/prajwaldongre/indian-bird-observations-tracking-species
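The bundled datasets are loaded by name with data(). The following is a minimal sketch of a defensive loader, assuming the IndiAPIs package is installed; the helper name load_indiapis_dataset is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of IndiAPIs): load one bundled dataset by name.
# Returns NULL, with a message, when the package or the dataset is unavailable.
load_indiapis_dataset <- function(name) {
  if (!requireNamespace("IndiAPIs", quietly = TRUE)) {
    message("Package 'IndiAPIs' is not installed.")
    return(NULL)
  }
  env <- new.env()
  # data() warns (rather than errors) on an unknown name, so check afterwards
  suppressWarnings(utils::data(list = name, package = "IndiAPIs", envir = env))
  if (!exists(name, envir = env, inherits = FALSE)) {
    message("Dataset '", name, "' was not found in 'IndiAPIs'.")
    return(NULL)
  }
  get(name, envir = env)
}

birds <- load_indiapis_dataset("birds_watching_tbl_df")
if (!is.null(birds)) {
  str(birds)  # documented above as 490 observations of 4 variables
}
```

Loading into a private environment keeps the global workspace untouched, which can be convenient in scripts that load several datasets.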
This dataset, BirthDeathRates_df, is a data frame containing data on human birth and death rates in India over the 20th century. It includes the year, birth rate, and death rate for each recorded period.
data(BirthDeathRates_df)
A data frame with 27 observations and 3 variables:
Year of observation (factor)
Birth rate (numeric)
Death rate (numeric)
The dataset name has been kept as 'BirthDeathRates_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame object. The original content has not been modified in any way.
Data taken from the gpk package version 1.0
This dataset, BombayPlague1905_df, is a data frame containing the cumulative number of plague deaths recorded each week in Bombay in 1905–06. The data were originally reported by Kermack and McKendrick (1927). Bombay is the former name of the Indian coastal city Mumbai, the capital of Maharashtra and one of the largest cities in the world.
data(BombayPlague1905_df)
A data frame with 32 observations and 2 variables:
Week number of the observation period (integer)
Cumulative number of plague deaths (integer)
The dataset name has been kept as 'BombayPlague1905_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame object. The original content has not been modified in any way.
Data taken from the primer package version 1.2.0
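Because the second variable is documented as a cumulative count, per-week deaths can be recovered by differencing. The sketch below assumes the series starts from zero before the first recorded week; the helper name weekly_from_cumulative is hypothetical, the sample values are placeholders, and the column is accessed by position because the variable names are not shown here.

```r
# Hypothetical helper: recover per-week counts from a cumulative series.
# Assumes the count was zero before the first recorded week.
weekly_from_cumulative <- function(cumulative) {
  stopifnot(is.numeric(cumulative))
  diff(c(0, cumulative))
}

# Placeholder values for illustration only (not taken from the dataset)
weekly_from_cumulative(c(4, 10, 15))  # 4 6 5

# With the package installed, column 2 holds the cumulative deaths
if (requireNamespace("IndiAPIs", quietly = TRUE)) {
  data("BombayPlague1905_df", package = "IndiAPIs")
  weekly <- weekly_from_cumulative(BombayPlague1905_df[[2]])
  print(head(weekly))
}
```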
This dataset, BurdwanRiceYield_df, is a data frame containing yearly rice yield data for the Burdwan district of West Bengal, India, over a period of 39 years. It includes the year and the yield in tonnes per hectare for each recorded year.
data(BurdwanRiceYield_df)
A data frame with 39 observations and 2 variables:
Year of observation (character)
Rice yield in tonnes per hectare (numeric)
The dataset name has been kept as 'BurdwanRiceYield_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame object. The original content has not been modified in any way.
Data taken from the weatherindices package version 0.1.0
This dataset, BurdwanWeather_df, is a data frame containing weekly weather data for the rice growing season in the Burdwan district of West Bengal, India, over a period of 39 years. It includes the date, standard meteorological week, week number, and four weather variables: maximum temperature, minimum temperature, precipitation, and relative humidity.
data(BurdwanWeather_df)
A data frame with 741 observations and 7 variables:
Date of observation (character)
Standard Meteorological Week (integer)
Week number within the season (integer)
Maximum temperature in degrees Celsius (numeric)
Minimum temperature in degrees Celsius (numeric)
Total precipitation in millimeters (numeric)
Relative humidity in percentage (numeric)
The dataset name has been kept as 'BurdwanWeather_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame object. The original content has not been modified in any way.
Data taken from the weatherindices package version 0.1.0
This dataset, ButterflySpecies_df, is a data frame containing the distribution of butterfly species counts among five groups across different localities in India. It includes information on the total number of species and counts for each butterfly group such as Skippers, Swallow tails, Whites & Yellows, Blues, and Brush Footed.
data(ButterflySpecies_df)
A data frame with 44 observations and 9 variables:
Serial number identifier (integer)
Geographic area within India (factor with 8 levels)
Specific locality name (factor with 44 levels)
Total number of butterfly species in the locality (integer)
Count of Skippers species (integer)
Count of Swallow tail species (integer)
Count of Whites and Yellows species (integer)
Count of Blues species (integer)
Count of Brush Footed species (integer)
The dataset name has been kept as 'ButterflySpecies_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame object. The original content has not been modified in any way.
Data taken from the gpk package version 1.0
This dataset, CyberCrime_India_tbl_df, is a tibble containing cybercrime statistics across Indian cities. It includes counts of various types of cybercrimes such as personal revenge, anger, fraud, extortion, causing disrepute, prank, sexual exploitation, disruption of public service, illegal drug sales, business development, piracy spreading, psychological offenses, information theft, abetment to suicide, and others, along with the total number of cases. The dataset preserves the original structure from its source on Kaggle.
data(CyberCrime_India_tbl_df)
A tibble with 191 observations and 17 variables:
City name (character)
Number of cybercrime cases related to personal revenge (numeric)
Number of cybercrime cases related to anger (numeric)
Number of fraud-related cybercrime cases (numeric)
Number of extortion-related cybercrime cases (numeric)
Number of cases causing disrepute (numeric)
Number of prank-related cybercrime cases (numeric)
Number of sexual exploitation cases (numeric)
Number of cases disrupting public services (numeric)
Number of cases involving sale or purchase of illegal drugs (numeric)
Number of cases related to developing own business (numeric)
Number of cases involving spreading piracy (numeric)
Number of psychological or pervert-related cases (numeric)
Number of information theft cases (numeric)
Number of cases of abetment to suicide (numeric)
Number of other types of cybercrime cases (numeric)
Total number of cybercrime cases (numeric)
The dataset name has been kept as 'CyberCrime_India_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/seanangelonathanael/dataset-cybercrime-in-india
This dataset, DataScienceJobs_tbl_df, is a tibble containing job listings related to Data Science positions across India. It includes company names, job titles, minimum experience required, average, minimum and maximum salaries, and the number of salary reports. The dataset preserves the original structure from its source on Kaggle.
data(DataScienceJobs_tbl_df)
A tibble with 1,602 observations and 8 variables:
Original column from the source file (numeric)
Name of the company offering the job (character)
Title of the job position (character)
Minimum experience required in years (numeric)
Average salary offered (numeric)
Minimum salary offered (numeric)
Maximum salary offered (numeric)
Number of salary reports for the job (numeric)
The dataset name has been kept as 'DataScienceJobs_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/madhurpant/data-science-jobs-in-india
This dataset, DelhiPotatoPrices_ts, is a time series containing the monthly average potato prices of the Delhi market from January 2010 to July 2020.
data(DelhiPotatoPrices_ts)
A time series with 127 time points and 1 variable:
Monthly average potato price in the Delhi market (numeric)
The dataset name has been kept as 'DelhiPotatoPrices_ts' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'ts' indicates that the dataset is a time series object. The original content has not been modified in any way.
Data taken from the stlELM package version 0.1.1
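A monthly series that starts in January 2010 and has 127 time points ends in July 2020, which matches the description above. The sketch below checks this with a placeholder series and shows how window() extracts one calendar year; it assumes the packaged object is a ts with frequency 12 starting in January 2010, which is not confirmed here.

```r
# Placeholder values stand in for the real prices (illustration only)
placeholder <- ts(seq_len(127), start = c(2010, 1), frequency = 12)

end(placeholder)  # year 2020, month 7

# Extract the twelve months of 2019
prices_2019 <- window(placeholder, start = c(2019, 1), end = c(2019, 12))
length(prices_2019)  # 12

# With the package installed, the same call applies to the real series
if (requireNamespace("IndiAPIs", quietly = TRUE)) {
  data("DelhiPotatoPrices_ts", package = "IndiAPIs")
  print(window(DelhiPotatoPrices_ts, start = c(2019, 1), end = c(2019, 12)))
}
```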
This dataset, diesel_fuelprice_tbl_df, is a tibble containing daily diesel fuel price data across multiple cities and states in India from 2002 to 2020. It includes city and state information, along with the date and diesel price rate. The dataset preserves the original structure from its source on Kaggle.
data(diesel_fuelprice_tbl_df)
A tibble with 17,235 observations and 4 variables:
Name of the city (character)
Date of the observation (Date)
Diesel price rate (numeric)
Name of the state (character)
The dataset name has been kept as 'diesel_fuelprice_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/sudhirnl7/fuel-price-in-india
This dataset, exports_imports_tbl_df, is a tibble containing export and import data for India from 1997 to July 2022. It includes information on country-wise exports, imports, total trade, and trade balance along with the financial year start and end dates. The dataset preserves the original structure from its source on Kaggle.
data(exports_imports_tbl_df)
A tibble with 5,994 observations and 7 variables:
Country name (character)
Export value (numeric)
Import value (numeric)
Total trade value (numeric)
Trade balance value (numeric)
Financial year start (numeric)
Financial year end (character)
The dataset name has been kept as 'exports_imports_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/ramjasmaurya/exports-and-imports-of-india19972022
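Total trade and trade balance are conventionally derived from the export and import values. The sketch below assumes total trade = exports + imports and trade balance = exports - imports; the dataset's own convention may differ, the helper name trade_summary is hypothetical, and the sample values are placeholders.

```r
# Hypothetical helper: derive total trade and trade balance.
# Assumes balance = exports - imports (positive means a surplus).
trade_summary <- function(exports, imports) {
  data.frame(
    total_trade   = exports + imports,
    trade_balance = exports - imports
  )
}

# Placeholder values for illustration only
trade_summary(exports = c(120, 80), imports = c(100, 95))
```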
This dataset, GDPIndia_tbl_df, is a tibble containing historical GDP data for India from 1960 to 2022. It includes columns as present in the original source file, preserving their exact names and formats. The dataset preserves the original structure from its source on Kaggle.
data(GDPIndia_tbl_df)
A tibble with 63 observations and 5 variables:
Original column from the source file (numeric)
Original column from the source file (character)
Original column from the source file (character)
Original column from the source file (character)
Original column from the source file (character)
The dataset name has been kept as 'GDPIndia_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/dheerajmukati/india-gdp-19602022
Retrieves comprehensive country information for India from the REST Countries API. This function fetches data including official and common names, geographical information, capital, area, population, and languages.
get_country_info_in()
This function makes a request to the REST Countries API v3.1 endpoint specifically for India using full-text search. It handles API errors gracefully, returning NULL if the request fails or no data is found.
A tibble with one row containing India's country information:
Common name of the country
Official name of the country
Geographic region
Geographic subregion
Capital city(ies)
Total area in square kilometers
Total population
Languages spoken (comma-separated)
## Not run:
# Get India information
in_info <- get_country_info_in()
print(in_info)
## End(Not run)
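Since the function may return NULL, callers should check the result before using it. The sketch below also shows what a full-text name lookup against the REST Countries v3.1 API looks like as a URL; the helper rest_countries_url is hypothetical, and whether the package builds its request exactly this way is an assumption.

```r
# Hypothetical helper: URL for a name lookup on REST Countries v3.1.
# fullText=true restricts the match to the exact country name.
rest_countries_url <- function(name, full_text = TRUE) {
  url <- paste0("https://restcountries.com/v3.1/name/",
                utils::URLencode(name, reserved = TRUE))
  if (full_text) url <- paste0(url, "?fullText=true")
  url
}

rest_countries_url("india")

# Defensive use of the package function: check for NULL before printing
if (interactive() && requireNamespace("IndiAPIs", quietly = TRUE)) {
  info <- IndiAPIs::get_country_info_in()
  if (is.null(info)) {
    message("No country information was returned.")
  } else {
    print(info)
  }
}
```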
Retrieves India's under-5 mortality rate, measured as the number of deaths
of children under five years of age per 1,000 live births, for the years 2010 to 2022
using the World Bank Open Data API. The indicator used is SH.DYN.MORT.
get_india_child_mortality()
This function sends a GET request to the World Bank API.
If the API request fails or returns an error status code,
the function returns NULL with an informative message.
A tibble with the following columns:
indicator: Indicator name (e.g., "Mortality rate, under-5 (per 1,000 live births)")
country: Country name ("India")
year: Year of the data (integer)
value: Mortality rate (per 1,000 live births)
Requires internet connection.
World Bank Open Data API: https://data.worldbank.org/indicator/SH.DYN.MORT
if (interactive()) { get_india_child_mortality() }
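The request-and-fallback behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the helper names wb_indicator_url and fetch_wb_indicator are hypothetical, and the code assumes the httr and jsonlite packages and the World Bank v2 indicator endpoint.

```r
# Hypothetical helper: request URL for one indicator (v2 API, JSON output)
wb_indicator_url <- function(indicator, country = "IN",
                             from = 2010, to = 2022) {
  sprintf(
    "https://api.worldbank.org/v2/country/%s/indicator/%s?format=json&date=%d:%d&per_page=100",
    country, indicator, as.integer(from), as.integer(to)
  )
}

# Hypothetical fetcher: returns a data frame, or NULL with a message on failure
fetch_wb_indicator <- function(indicator) {
  if (!requireNamespace("httr", quietly = TRUE) ||
      !requireNamespace("jsonlite", quietly = TRUE)) {
    message("Packages 'httr' and 'jsonlite' are required.")
    return(NULL)
  }
  resp <- tryCatch(
    httr::GET(wb_indicator_url(indicator), httr::timeout(10)),
    error = function(e) NULL
  )
  if (is.null(resp) || httr::status_code(resp) != 200) {
    message("World Bank API request failed.")
    return(NULL)
  }
  parsed <- tryCatch(
    jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8")),
    error = function(e) NULL
  )
  # A successful response is a two-element array: paging metadata, then records
  if (length(parsed) < 2 || !is.data.frame(parsed[[2]])) {
    message("World Bank API returned no data.")
    return(NULL)
  }
  records <- parsed[[2]]
  data.frame(
    indicator = records$indicator$value,
    country   = records$country$value,
    year      = as.integer(records$date),
    value     = records$value
  )
}

if (interactive()) {
  print(fetch_wb_indicator("SH.DYN.MORT"))
}
```

Keeping the URL builder separate from the network call makes the request easy to inspect without an internet connection.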
Retrieves India's Consumer Price Index (CPI), with 2010 as the base year (index = 100),
for the years 2010 to 2022 using the World Bank Open Data API.
The indicator used is FP.CPI.TOTL.
get_india_cpi()
This function sends a GET request to the World Bank API.
If the API request fails or returns an error status code,
the function returns NULL with an informative message.
A tibble with the following columns:
indicator: Indicator name (e.g., "Consumer price index (2010 = 100)")
country: Country name ("India")
year: Year of the data (integer)
value: Consumer Price Index (numeric, base year 2010 = 100)
Requires internet connection.
World Bank Open Data API: https://data.worldbank.org/indicator/FP.CPI.TOTL
if (interactive()) { get_india_cpi() }
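Because the CPI is an index with 2010 = 100, year-over-year inflation can be derived as the percentage change between consecutive index values. The sketch below is one way to compute it; the helper name yoy_change_pct is hypothetical, the sample values are placeholders, and the input must be sorted by year in ascending order.

```r
# Hypothetical helper: percentage change between consecutive index values.
# The first element has no predecessor, so its change is NA.
yoy_change_pct <- function(index) {
  stopifnot(is.numeric(index))
  c(NA_real_, (index[-1] / index[-length(index)] - 1) * 100)
}

# Placeholder index values for illustration only
yoy_change_pct(c(100, 110, 121))

# With the package installed and a connection available:
if (interactive() && requireNamespace("IndiAPIs", quietly = TRUE)) {
  cpi <- IndiAPIs::get_india_cpi()
  if (!is.null(cpi)) {
    cpi <- cpi[order(cpi$year), ]
    cpi$inflation_pct <- yoy_change_pct(cpi$value)
    print(cpi)
  }
}
```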
Retrieves India's energy use per capita, measured in kilograms of oil equivalent,
for the years 2010 to 2022 using the World Bank Open Data API.
The indicator used is EG.USE.PCAP.KG.OE.
get_india_energy_use()
This function sends a GET request to the World Bank API.
If the API request fails or returns an error status code,
the function returns NULL with an informative message.
A tibble with the following columns:
indicator: Indicator name (e.g., "Energy use (kg of oil equivalent per capita)")
country: Country name ("India")
year: Year of the data (integer)
value: Energy use per capita (in kg of oil equivalent)
Requires internet connection.
World Bank Open Data API: https://data.worldbank.org/indicator/EG.USE.PCAP.KG.OE
if (interactive()) { get_india_energy_use() }
Retrieves India's Gross Domestic Product (GDP) at current US dollars
for the years 2010 to 2022 using the World Bank Open Data API.
The indicator used is NY.GDP.MKTP.CD.
get_india_gdp()
This function sends a GET request to the World Bank API.
If the API request fails or returns an error status code,
the function returns NULL with an informative message.
A tibble with the following columns:
indicator: Indicator name (e.g., "GDP (current US$)")
country: Country name ("India")
year: Year of the data (integer)
value: GDP in current US dollars (numeric)
value_label: GDP formatted with comma separators
Requires internet connection.
World Bank Open Data API: https://data.worldbank.org/indicator/NY.GDP.MKTP.CD
See also: GET, fromJSON, as_tibble, comma
if (interactive()) { get_india_gdp() }
Retrieves the number of hospital beds per 1,000 people in India
for the years 2010 to 2022 using the World Bank Open Data API.
The indicator used is SH.MED.BEDS.ZS.
get_india_hospital_beds()
This function sends a GET request to the World Bank API.
If the API request fails or returns an error status code,
the function returns NULL with an informative message.
A tibble with the following columns:
indicator: Indicator name (e.g., "Hospital beds (per 1,000 people)")
country: Country name ("India")
year: Year of the data (integer)
value: Number of hospital beds per 1,000 people (numeric)
Requires internet connection.
World Bank Open Data API: https://data.worldbank.org/indicator/SH.MED.BEDS.ZS
if (interactive()) { get_india_hospital_beds() }
Retrieves India's life expectancy at birth (total, in years) for the years 2010 to 2022
using the World Bank Open Data API. The indicator used is SP.DYN.LE00.IN.
get_india_life_expectancy()
This function sends a GET request to the World Bank API.
If the API request fails or returns an error status code,
the function returns NULL with an informative message.
A tibble with the following columns:
indicator: Indicator name (e.g., "Life expectancy at birth, total (years)")
country: Country name ("India")
year: Year of the data (integer)
value: Life expectancy at birth (in years)
Requires internet connection.
World Bank Open Data API: https://data.worldbank.org/indicator/SP.DYN.LE00.IN
if (interactive()) { get_india_life_expectancy() }
Retrieves India's adult literacy rate, defined as the percentage of people
ages 15 and above who can read and write, for the years 2010 to 2022
using the World Bank Open Data API. The indicator used is SE.ADT.LITR.ZS.
get_india_literacy_rate()
This function sends a GET request to the World Bank API.
If the API request fails or returns an error status code,
the function returns NULL with an informative message.
A tibble with the following columns:
indicator: Indicator name (e.g., "Literacy rate, adult total (% of people ages 15 and above)")
country: Country name ("India")
year: Year of the data (integer)
value: Adult literacy rate (percentage)
Requires internet connection.
World Bank Open Data API: https://data.worldbank.org/indicator/SE.ADT.LITR.ZS
if (interactive()) { get_india_literacy_rate() }
Retrieves India's total population for the years 2010 to 2022
using the World Bank Open Data API. The indicator used is SP.POP.TOTL.
get_india_population()
The function sends a GET request to the World Bank API.
If the API request fails or returns an error status code, the function returns NULL with an informative message.
A tibble with the following columns:
indicator: Indicator name (e.g., "Population, total")
country: Country name ("India")
year: Year of the data (integer)
value: Population as a numeric value
value_label: Formatted population with commas (e.g., "1,400,000,000")
Requires internet connection. The data is retrieved in real time from the World Bank API.
World Bank Open Data API: https://data.worldbank.org/indicator/SP.POP.TOTL
See also: GET, fromJSON, as_tibble, comma
if (interactive()) { get_india_population() }
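A value_label column of the kind described above can be reproduced with comma formatting. The sketch below uses base R format() rather than the package's own code, so the exact formatting call the package uses is an assumption; the helper name with_commas is hypothetical.

```r
# Hypothetical helper: format numbers with comma thousands separators.
# round() drops any fractional part; trim = TRUE avoids padding in vectors.
with_commas <- function(x) {
  format(round(x), big.mark = ",", scientific = FALSE, trim = TRUE)
}

with_commas(1400000000)        # "1,400,000,000"
with_commas(c(1000, 1234567))  # "1,000" "1,234,567"
```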
Retrieves India's total unemployment rate as a percentage of the total labor force for the years 2010 to 2022 using the World Bank Open Data API. The indicator used is SL.UEM.TOTL.ZS.
get_india_unemployment()
This function sends a GET request to the World Bank API. If the API request fails or returns an error status code, the function returns NULL with an informative message.
A tibble with the following columns:
indicator: Indicator name
country: Country name (India)
year: Year of the data (integer)
value: Unemployment rate as percentage
Requires internet connection.
World Bank Open Data API: https://data.worldbank.org/indicator/SL.UEM.TOTL.ZS
## Not run:
unemployment_data <- get_india_unemployment()
print(unemployment_data)
## End(Not run)
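The World Bank functions share the year and value columns, so their results can be combined into one wide table keyed by year. The sketch below skips any result that came back as NULL; the helper name combine_by_year is hypothetical and the sample data frames are placeholders.

```r
# Hypothetical helper: merge several indicator tables into one wide table.
# `results` is a named list; NULL entries (failed requests) are dropped.
combine_by_year <- function(results) {
  results <- Filter(Negate(is.null), results)
  if (length(results) == 0) return(NULL)
  pieces <- Map(function(df, nm) {
    out <- as.data.frame(df)[, c("year", "value")]
    names(out)[2] <- nm  # one value column per indicator
    out
  }, results, names(results))
  Reduce(function(a, b) merge(a, b, by = "year", all = TRUE), pieces)
}

# Placeholder tables for illustration only
a <- data.frame(year = 2010:2011, value = c(1, 2))
b <- data.frame(year = 2011:2012, value = c(3, 4))
wide <- combine_by_year(list(gdp = a, population = b, cpi = NULL))
print(wide)

# With the package installed and a connection available:
if (interactive() && requireNamespace("IndiAPIs", quietly = TRUE)) {
  print(combine_by_year(list(
    unemployment = IndiAPIs::get_india_unemployment(),
    population   = IndiAPIs::get_india_population()
  )))
}
```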
This dataset, GoldPricesIndia_df, is a data frame containing the monthly high and low prices (in rupees per gram) of 22-carat gold in six Indian cities: Chennai, Kolkata, Bangalore, Madurai, Hyderabad, and Delhi. Data were collected from February 2022 to January 2023.
data(GoldPricesIndia_df)
A data frame with 12 observations and 13 variables:
Month of observation (character)
Lowest price in Chennai (numeric)
Highest price in Chennai (numeric)
Lowest price in Kolkata (numeric)
Highest price in Kolkata (numeric)
Lowest price in Bangalore (numeric)
Highest price in Bangalore (numeric)
Lowest price in Madurai (numeric)
Highest price in Madurai (numeric)
Lowest price in Hyderabad (numeric)
Highest price in Hyderabad (numeric)
Lowest price in Delhi (numeric)
Highest price in Delhi (numeric)
The dataset name has been kept as 'GoldPricesIndia_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame object. The original content has not been modified in any way.
Data taken from the neutrostat package version 0.0.2
This dataset, hospitalcount_tbl_df, is a tibble containing the count of hospitals in India by state and union territory. It includes the number of hospitals in the public sector, the private sector, and the total number of hospitals (public + private) for each state or UT. The dataset preserves the original structure from its source on Kaggle.
data(hospitalcount_tbl_df)
A tibble with 37 observations and 4 variables:
Name of the state or union territory (character)
Number of hospitals in the public sector (numeric)
Number of hospitals in the private sector (numeric)
Total number of hospitals combining public and private sectors (numeric)
The dataset name has been kept as 'hospitalcount_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/gokulprakash22/hospitals-count-in-india-statewise
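Since the total is described as public + private, the relationship can be verified row by row. The sketch below accesses columns by position (2 = public, 3 = private, 4 = total, following the order documented above) because the variable names are not shown here; the helper name totals_match is hypothetical.

```r
# Hypothetical helper: TRUE where public + private equals the reported total
totals_match <- function(public, private, total) {
  (public + private) == total
}

# Placeholder values for illustration only
totals_match(public = c(10, 20), private = c(5, 7), total = c(15, 30))

# With the package installed, list any rows where the totals disagree
if (requireNamespace("IndiAPIs", quietly = TRUE)) {
  data("hospitalcount_tbl_df", package = "IndiAPIs")
  ok <- totals_match(hospitalcount_tbl_df[[2]],
                     hospitalcount_tbl_df[[3]],
                     hospitalcount_tbl_df[[4]])
  print(hospitalcount_tbl_df[!ok %in% TRUE, ])
}
```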
This dataset, India_census2011_tbl_df, is a tibble containing population statistics for Indian districts based on the 2011 Census. It includes district ranking, population, growth rate, sex ratio, and literacy statistics for each district. The dataset preserves the original structure from its source on Kaggle.
data(India_census2011_tbl_df)
A tibble with 610 observations and 7 variables:
District ranking (numeric)
District name (character)
State name (character)
Population count (numeric)
Population growth rate (character)
Sex ratio (number of females per 1000 males) (numeric)
Literacy rate (numeric)
The dataset name has been kept as 'India_census2011_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/shiivvvaam/indian-districts-population-data
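The population growth rate is stored as character, so it needs conversion before numeric analysis. The sketch below assumes the values are numbers followed by a percent sign, which is not confirmed here; the helper name parse_percent is hypothetical and the sample strings are placeholders.

```r
# Hypothetical helper: strip everything except digits, dot, and minus sign,
# then convert to numeric. Assumes values such as "12.5 %".
parse_percent <- function(x) {
  as.numeric(gsub("[^0-9.-]", "", x))
}

# Placeholder strings for illustration only
parse_percent(c("12.5 %", "-3.1%"))  # 12.5 -3.1
```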
This dataset, India_Companies_tbl_df, is a tibble containing information about notable companies headquartered in India, including those in the Fortune Global 500. It includes company names, industry, sector, headquarters location, founding year, notes, private or state ownership status, and whether the company is active or defunct. The dataset preserves the original structure from its source on Kaggle.
data(India_Companies_tbl_df)
A tibble with 493 observations and 8 variables:
Name of the company (character)
Industry classification (character)
Sector classification (character)
Primary headquarters location (character)
Year the company was founded (character)
Additional notes or remarks (character)
Ownership status: private or state-owned (character)
Company status: active or defunct (character)
The dataset name has been kept as 'India_Companies_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/mrmars1010/companies-in-india
This dataset, India_SharkTank_tbl_df, is a tibble containing detailed information on pitches presented on Shark Tank India. It includes episode and pitch numbers, brand names, business ideas, deal status, financial details (ask amount, equity, valuation, deal amount, equity, and valuation), presence of each shark during the pitch, whether each shark invested, total sharks invested, amount per shark, and equity per shark. The dataset preserves the original structure from its source on Kaggle.
data(India_SharkTank_tbl_df)
A tibble with 117 observations and 28 variables:
Episode number (numeric)
Pitch number within the episode (numeric)
Name of the brand presented (character)
Business idea description (character)
Indicator if a deal was made (numeric; 1 = yes, 0 = no)
Amount requested by the pitcher (numeric)
Equity percentage requested by the pitcher (numeric)
Valuation based on the pitcher’s ask (numeric)
Amount invested in the deal (numeric)
Equity percentage given in the deal (numeric)
Valuation based on the deal (numeric)
Indicator if Ashneer was present (numeric; 1 = yes, 0 = no)
Indicator if Anupam was present (numeric; 1 = yes, 0 = no)
Indicator if Aman was present (numeric; 1 = yes, 0 = no)
Indicator if Namita was present (numeric; 1 = yes, 0 = no)
Indicator if Vineeta was present (numeric; 1 = yes, 0 = no)
Indicator if Peyush was present (numeric; 1 = yes, 0 = no)
Indicator if Ghazal was present (numeric; 1 = yes, 0 = no)
Indicator if Ashneer invested (numeric; 1 = yes, 0 = no)
Indicator if Anupam invested (numeric; 1 = yes, 0 = no)
Indicator if Aman invested (numeric; 1 = yes, 0 = no)
Indicator if Namita invested (numeric; 1 = yes, 0 = no)
Indicator if Vineeta invested (numeric; 1 = yes, 0 = no)
Indicator if Peyush invested (numeric; 1 = yes, 0 = no)
Indicator if Ghazal invested (numeric; 1 = yes, 0 = no)
Total number of sharks who invested (numeric)
Investment amount per shark (numeric)
Equity percentage per shark (numeric)
The dataset name has been kept as 'India_SharkTank_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/shivavashishtha/shark-tank-india-dataset
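The valuation variables follow from an amount and an equity percentage: an investor paying an amount for a given share of the company implies a valuation of amount divided by the equity fraction. The sketch below assumes this standard relationship, which may not be exactly how the source computed its columns; the helper name implied_valuation is hypothetical and the currency units are left unspecified.

```r
# Hypothetical helper: valuation implied by an amount and an equity percentage.
# Returns NA where the equity percentage is zero, negative, or missing.
implied_valuation <- function(amount, equity_pct) {
  ifelse(equity_pct > 0, amount * 100 / equity_pct, NA_real_)
}

# Placeholder values: 50 units for 10% equity implies a valuation of 500 units
implied_valuation(amount = 50, equity_pct = 10)  # 500
implied_valuation(amount = 50, equity_pct = 0)   # NA (no deal equity)
```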
This dataset, IndiaLandReforms_df, is a data frame containing information on politics and land reforms in India. It includes variables related to agricultural landholding patterns, rural development indicators, election outcomes, political participation, and socio-economic measures across different districts and years.
data(IndiaLandReforms_df)
A data frame with 2,670 observations and 32 variables:
Mouza code or identifier (integer)
Year of observation (integer)
District code or identifier (integer)
Proportion of land cultivated (numeric)
Proportion of rural households (numeric)
Proportion of land below a certain threshold (numeric)
Proportion of rural households with a given characteristic (numeric)
Election year indicator (integer)
Pre-election indicator (integer)
Electoral variable - women in local councils (numeric)
Electoral variable - less cultivated land (numeric)
Electoral variable - medium cultivated land (numeric)
Electoral variable - small cultivated land (numeric)
Electoral variable - medium developed cultivated land (numeric)
Electoral variable - custom smallholder measure (numeric)
Electoral variable - custom big landholder measure (numeric)
Electoral variable - illiteracy rate (numeric)
Electoral variable - low-income households (numeric)
Political variable - left party ratio before adjustment (numeric)
Development variable - women in local councils (numeric)
Inflation rate (numeric)
Share of female employment in youth (numeric)
Number of seats won by incumbents (numeric)
Number of seats won by left parties (numeric)
Inflation flag indicator (numeric)
Incumbent flag indicator (numeric)
Left party flag indicator (numeric)
Political variable - left party ratio (numeric)
Inflation index for wages (numeric)
Inflation index for unspecified metric (numeric)
Inflation index for agricultural labor (numeric)
Gram Panchayat code or identifier (integer)
The dataset name has been kept as 'IndiaLandReforms_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame object. The original content has not been modified in any way.
Data taken from the pder package version 1.0-2
This dataset, indianPopulation_tbl_df, is a tibble containing census data and population projections for Indian states across multiple years. It includes state codes, abbreviations, names, and population figures for the years 1901, 1951, 2011, 2023, and 2024.
data(indianPopulation_tbl_df)
A tibble with 36 observations and 8 variables:
Numeric state code (numeric)
State abbreviation (character)
Full state name (character)
Population in the year 1901 (numeric)
Population in the year 1951 (numeric)
Population in the year 2011 (numeric)
Population in the year 2023 (numeric)
Population in the year 2024 (numeric)
The dataset name has been kept as 'indianPopulation_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble object. The original content has not been modified in any way.
Data taken from the mapindia package version 1.0.1
This package provides functions to access data from public RESTful APIs including 'World Bank API', and 'REST Countries API', retrieving real-time or historical data related to India, such as economic indicators, and international demographic and geopolitical indicators. Additionally, the package includes one of the largest curated collections of open datasets focused on India, covering topics such as population, economy, weather, politics, health, biodiversity, sports, agriculture, cybercrime, infrastructure, and more.
IndiAPIs: Access Indian Data via Public APIs and Curated Datasets
Maintainer: Renzo Caceres Rossi [email protected]
This dataset, IndiaPopulation_dt, is a data table containing the names of states and union territories in India along with their respective abbreviations and populations. The dataset also includes the total population of India. These are 2019 projections as reported in the Unique Identification Authority of India 2019-2020 Annual Report.
data(IndiaPopulation_dt)
A data.table with 39 observations and 3 variables:
Name of the state or union territory (character)
Abbreviation for the state or union territory (character)
Population in 2019 projection (numeric)
The dataset name has been kept as 'IndiaPopulation_dt' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'dt' indicates that the dataset is a data.table object. The original content has not been modified in any way.
Data taken from the covid19india package version 0.1.4
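Because the table holds a national total alongside the state and union territory rows, that total row should be separated out before summing or computing shares. The following is a minimal sketch on made-up data: the column names (`name`, `abbrev`, `population`) and the label `"India"` for the total row are assumptions for illustration, so inspect `names(IndiaPopulation_dt)` and the actual row labels before adapting it.

```r
# Made-up rows; column names and the "India" label are illustrative assumptions.
pop <- data.frame(
  name = c("State A", "State B", "Territory C", "India"),
  abbrev = c("SA", "SB", "TC", "IN"),
  population = c(500, 300, 200, 1000),
  stringsAsFactors = FALSE
)

# Separate the national total from the regional rows.
is_total <- pop$name == "India"
regions <- pop[!is_total, ]

# Share of the national total held by each region.
regions$share <- regions$population / pop$population[is_total]
print(regions)
```

Keeping the total in a separate object avoids double counting when the regional rows are aggregated.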
This dataset, IPLCricket_tbl_df, is a tibble containing over-level match data from Indian Premier League (IPL) seasons 2008 to 2016, with matches played by teams representing different cities in India. Each row summarizes one over of an innings.
data(IPLCricket_tbl_df)
A tibble with 8,560 observations and 10 variables:
Season year of the IPL (numeric)
Unique match identifier (numeric)
Name of the batting team (character)
Name of the bowling team (character)
Inning number (numeric)
Over number (numeric)
Number of wickets taken in the over (numeric)
Number of dot balls in the over (numeric)
Runs scored in the over (numeric)
Run rate for the over (numeric)
The dataset name has been kept as 'IPLCricket_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble object. The original content has not been modified in any way.
Data taken from the gravitas package version 0.1.3
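Since the variables describe individual overs, innings totals have to be obtained by aggregation. The sketch below works on a small made-up data frame; the column names (`match_id`, `inning`, `over`, `runs`, `wickets`) are hypothetical and may differ from those in IPLCricket_tbl_df, so check `names(IPLCricket_tbl_df)` first.

```r
# Made-up over-level rows; column names are illustrative assumptions.
overs <- data.frame(
  match_id = c(1, 1, 1, 1, 1, 1),
  inning   = c(1, 1, 1, 2, 2, 2),
  over     = c(1, 2, 3, 1, 2, 3),
  runs     = c(6, 10, 4, 8, 3, 12),
  wickets  = c(0, 1, 0, 0, 2, 0)
)

# Sum runs and wickets across all overs of each innings.
innings_totals <- aggregate(
  cbind(runs, wickets) ~ match_id + inning,
  data = overs,
  FUN = sum
)
print(innings_totals)
```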
This dataset, petrol_fuelprice_tbl_df, is a tibble containing daily petrol fuel price data across multiple cities and states in India from 2002 to 2020. It includes city and state information, along with the date and petrol price rate. The dataset preserves the original structure from its source on Kaggle.
data(petrol_fuelprice_tbl_df)
A tibble with 5,048 observations and 4 variables:
Name of the city (character)
Date of the observation (Date)
Petrol price rate (numeric)
Name of the state (character)
The dataset name has been kept as 'petrol_fuelprice_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/sudhirnl7/fuel-price-in-india
This dataset, petrol_prices_tbl_df, is a tibble containing petrol price information across various cities in India. It includes the city name, date of the price record, and the petrol rate on that date. The dataset preserves the original structure from its source on Kaggle.
data(petrol_prices_tbl_df)
A tibble with 1,024 observations and 3 variables:
Name of the city (character)
Date of the petrol price record (Date)
Petrol price rate (numeric)
The dataset name has been kept as 'petrol_prices_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/sandipdevre/petrol-prices-in-india
This dataset, rainfall_tbl_df, is a tibble containing historical monthly rainfall data for subdivisions in India from 1901 to 2021. It includes rainfall measurements for June, July, August, September, and the total for June to September, along with the year and subdivision name. The dataset preserves the original structure from its source on Kaggle.
data(rainfall_tbl_df)
A tibble with 4,332 observations and 7 variables:
Name of the subdivision (character)
Year of observation (numeric)
Rainfall in June (numeric)
Rainfall in July (numeric)
Rainfall in August (numeric)
Rainfall in September (numeric)
Total rainfall from June to September (numeric)
The dataset name has been kept as 'rainfall_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/aksahaha/rainfall-india
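The June to September total can be recomputed from the four monthly columns, which is a simple consistency check on the seasonal figure. This is a sketch on made-up values with hypothetical column names (`jun`, `jul`, `aug`, `sep`); it has not been verified that the stored totals in rainfall_tbl_df match the monthly sums exactly.

```r
# Made-up rows; column names and values are illustrative assumptions.
rain <- data.frame(
  subdivision = c("A", "B"),
  year = c(1901, 1901),
  jun = c(100.5, 50.0),
  jul = c(200.0, 75.5),
  aug = c(150.25, 60.0),
  sep = c(80.0, 40.5)
)

# Seasonal total recomputed from the four monthly values.
rain$jun_sep <- rowSums(rain[, c("jun", "jul", "aug", "sep")])
print(rain)
```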
This dataset, road_population_tbl_df, is a tibble containing detailed information about road infrastructure and population data for Indian states. It includes lengths of various road types, road density metrics, area statistics, and rural and urban population data according to the 2011 census. The dataset preserves the original structure from its source on Kaggle.
data(road_population_tbl_df)
A tibble with 36 observations and 27 variables:
Name of the state or union territory (character)
Length of national highways in kilometers (numeric)
Length of state highways in kilometers (numeric)
Length of district roads in kilometers (numeric)
Length of rural roads in kilometers (numeric)
Length of urban roads in kilometers (numeric)
Length of project roads in kilometers (numeric)
Total length of roads in kilometers (numeric)
Total area of the state/UT in square kilometers (numeric)
Density of urban roads (numeric)
Density of rural roads (numeric)
Road length per 1000 square kilometers of entire state (numeric)
Urban road length per 1000 square kilometers (numeric)
Rural road length per 1000 square kilometers (numeric)
Overall road density (numeric)
National highways road density per 1000 sq km (numeric)
State highways road density per 1000 sq km (numeric)
District roads road density per 1000 sq km (numeric)
Rural roads road density per 1000 sq km (numeric)
Urban roads road density per 1000 sq km (numeric)
Project roads road density per 1000 sq km (numeric)
Area of the state/UT (numeric)
Rural area in 2011 census (numeric)
Urban area in 2011 census (numeric)
Rural population according to 2011 census (numeric)
Urban population according to 2011 census (numeric)
Total population of the state/UT (numeric)
The dataset name has been kept as 'road_population_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/zsinghrahulk/india-roadforpopulation-data
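A density expressed per 1000 square kilometers is a road length divided by an area and scaled by 1000. The sketch below illustrates that arithmetic on made-up figures; the column names are hypothetical, and it is an assumption that the density columns in road_population_tbl_df were derived this way.

```r
# Made-up figures; column names are illustrative assumptions.
roads <- data.frame(
  state = c("State A", "State B"),
  total_road_km = c(5000, 120000),
  area_sq_km = c(2500, 300000)
)

# Kilometers of road per 1000 square kilometers of area.
roads$road_km_per_1000_sq_km <- roads$total_road_km / roads$area_sq_km * 1000
print(roads)
```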
This dataset, smartphones5G_tbl_df, is a tibble containing detailed information about 5G smartphones available in India as of 2022. It includes product names, processor details, camera specifications, display size, RAM, storage, battery, Android version, prices listed on two different websites, the actual available price, and the score assigned by SmartPrice. The dataset preserves the original structure from its source on Kaggle.
data(smartphones5G_tbl_df)
A tibble with 257 observations and 15 variables:
Name of the smartphone product (character)
Name of the processor used (character)
Rear camera specifications (character)
Front camera specifications (character)
Display size specification (character)
RAM size specification (character)
Storage capacity specification (character)
Battery specification (character)
Android version running on the phone (character)
First website for price reference (character)
Price listed on the first site (character)
Second website for price reference (character)
Price listed on the second site (character)
Actual available price (numeric)
Score assigned by SmartPrice (numeric)
The dataset name has been kept as 'smartphones5G_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/ramjasmaurya/5g-smartphones-available-in-india
This dataset, startup_funding_tbl_df, is a tibble containing detailed funding information for startups in India. It includes the serial number, date, startup name, industry vertical, sub-vertical, city location, investors' names, investment type, amount in USD, and any additional remarks. The dataset preserves the original structure from its source on Kaggle.
data(startup_funding_tbl_df)
A tibble with 3,044 observations and 10 variables:
Serial number of the record (numeric)
Date of the funding record in dd/mm/yyyy format (character)
Name of the startup (character)
Primary industry vertical of the startup (character)
Specific sub-vertical within the industry (character)
City where the startup is located (character)
Name(s) of the investor(s) (character)
Type of investment (character)
Funding amount in US dollars (character)
Additional remarks related to the record (character)
The dataset name has been kept as 'startup_funding_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/sudalairajkumar/indian-startup-funding
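Both the date and the funding amount are stored as character, so they usually need converting before analysis. The following sketch parses dd/mm/yyyy strings with `as.Date()` and strips thousands separators before calling `as.numeric()`. The column names and sample values are made up, and the presence of commas or non-numeric entries in the real amount column is an assumption.

```r
# Made-up rows; column names and string formats are illustrative assumptions.
funding <- data.frame(
  date = c("05/09/2019", "12/01/2018", "30/11/2017"),
  amount_usd = c("1,000,000", "250000", "undisclosed"),
  stringsAsFactors = FALSE
)

# Parse dd/mm/yyyy strings into Date objects.
funding$date_parsed <- as.Date(funding$date, format = "%d/%m/%Y")

# Remove commas, then convert; entries that are not numbers become NA.
funding$amount_num <- suppressWarnings(
  as.numeric(gsub(",", "", funding$amount_usd, fixed = TRUE))
)
print(funding)
```

Entries that cannot be parsed are left as `NA` rather than dropped, so they remain visible for later inspection.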
This dataset, Top500Cities_tbl_df, is a tibble containing demographic and literacy data for the top 500 cities in India. It includes population counts by gender and age group, literacy rates, sex ratios, graduation counts, and location information. The dataset preserves the original structure from its source on Kaggle.
data(Top500Cities_tbl_df)
A tibble with 493 observations and 22 variables:
Name of the city (character)
State code (numeric)
Name of the state (character)
District code (numeric)
Total population (numeric)
Male population (numeric)
Female population (numeric)
Total population aged 0-6 years (numeric)
Male population aged 0-6 years (numeric)
Female population aged 0-6 years (numeric)
Total literates (numeric)
Male literates (numeric)
Female literates (numeric)
Sex ratio (females per 1000 males) (numeric)
Child sex ratio (females per 1000 males) (numeric)
Effective literacy rate total (numeric)
Effective literacy rate for males (numeric)
Effective literacy rate for females (numeric)
Location coordinates or description (character)
Total number of graduates (numeric)
Number of male graduates (numeric)
Number of female graduates (numeric)
The dataset name has been kept as 'Top500Cities_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/zed9941/top-500-indian-cities
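The ratio variables can be related to the count variables. The sex ratio is females per 1000 males, and the effective literacy rate is conventionally computed in the Indian census against the population aged 7 and above, that is, the total population minus the 0-6 age group. The sketch below applies both formulas to one made-up city; the column names are hypothetical, and it has not been checked that the dataset's own rate columns reproduce exactly from its counts.

```r
# One made-up city; column names and values are illustrative assumptions.
city <- data.frame(
  name = "Example City",
  population_total = 100000,
  population_male = 52000,
  population_female = 48000,
  population_0_6 = 10000,
  literates_total = 72000
)

# Females per 1000 males.
city$sex_ratio <- city$population_female / city$population_male * 1000

# Literates as a percentage of the population aged 7 and above.
city$effective_literacy <- city$literates_total /
  (city$population_total - city$population_0_6) * 100
print(city)
```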
This dataset, Unicorn_startups_tbl_df, is a tibble containing information about Indian unicorn startups as of 2023. It includes company names, sectors, entry valuations, current valuations, entry years, locations, and select investors. The dataset preserves the original structure from its source on Kaggle.
data(Unicorn_startups_tbl_df)
A tibble with 102 observations and 8 variables:
Serial number (numeric)
Name of the startup company (character)
Business sector of the startup (character)
Entry valuation in billions (numeric)
Current valuation in billions (numeric)
Year of entry into unicorn status (character)
Location of the startup (character)
Select investors in the startup (character)
The dataset name has been kept as 'Unicorn_startups_tbl_df' to maintain consistency with the naming conventions in the IndiAPIs package. The suffix 'tbl_df' indicates that this is a tibble data frame. The original content has not been modified in any way.
Data obtained from Kaggle: https://www.kaggle.com/datasets/mlvprasad/indian-unicorn-startups-2023-june-updated
This function lists all datasets available in the 'IndiAPIs' package. If the 'IndiAPIs' package is not loaded, it stops and shows an error message. If no datasets are available, it returns a message and an empty vector.
view_datasets_IndiAPIs()
A character vector with the names of the available datasets. If no datasets are found, it returns an empty character vector.
if (requireNamespace("IndiAPIs", quietly = TRUE)) {
  library(IndiAPIs)
  view_datasets_IndiAPIs()
}
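For comparison, dataset names can also be listed for any installed package with base R's `utils::data()`. The helper below, `list_package_datasets()`, is hypothetical and is not part of IndiAPIs; it is a sketch of one way to get a similar listing, not a description of how `view_datasets_IndiAPIs()` is implemented. Unlike the package function, it returns an empty vector rather than stopping when the package is unavailable.

```r
# Hypothetical helper, not part of IndiAPIs: list the dataset names that
# an installed package registers with utils::data().
list_package_datasets <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Package not installed: return an empty character vector.
    return(character(0))
  }
  # The "Item" column of the results matrix holds the dataset names.
  items <- utils::data(package = pkg)$results[, "Item"]
  unname(items)
}

# Demonstrated on the base 'datasets' package, which ships with R.
head(list_package_datasets("datasets"))
```

Passing `"IndiAPIs"` instead of `"datasets"` lists the package's own datasets when it is installed.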
This dataset, WestBengalPop_tbl_df, is a tibble containing demographic data for districts of West Bengal, India, based on the 2011 Census. It includes total population, population increase percentage, sex ratio, literacy percentage, and population density for each district.
data(WestBengalPop_tbl_df)
A tibble with 23 observations and 8 variables:
Numeric district code (numeric)
District abbreviation (character)
Full district name (character)
Population in the year 2011 (numeric)
Population increase percentage in 2011 compared to the previous census (numeric)
Sex ratio in 2011, expressed as females per 1,000 males (numeric)
Literacy rate in 2011, expressed as a percentage (numeric)
Population density in 2011 (persons per square kilometer) (numeric)
The dataset name has been kept as 'WestBengalPop_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the IndiAPIs package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble object. The original content has not been modified in any way.
Data taken from the mapindia package version 1.0.1