Q. Searching for data in ICPSR

Answered By: Andrew Greenman
Last Updated: May 18, 2022

ICPSR is the Inter-University Consortium for Political and Social Research, a major social and human science dataset repository and a common place for students and researchers to find datasets.

Please see also the subject guide on Accessing ICPSR datasets at AU for information about setting up an account to download data and accessing ICPSR off-campus. AU CTRL also has a video on using ICPSR, including account creation, downloading data, and using the online analysis tool, which are not covered in this FAQ.


What can I search for in ICPSR?

You can use the ICPSR search box to search study names and descriptions, variable names and descriptions, collections of related datasets, publications that use data contained in ICPSR, and ICPSR webpages. You can choose which of these you search by selecting the corresponding heading on the search results page, or by selecting it in the “Find Data” drop-down menu.


How do I search in ICPSR?

The keyword search in ICPSR is very similar to searching other library databases! The search system accepts Boolean operators and has a long list of powerful filters you can use to narrow down results to what you’re looking for.

Individual terms are automatically joined with an AND operator. For example, entering home ownership would be interpreted as home AND ownership. If you want to search for the exact phrase, put quotation marks around it as you would in many other search systems: “home ownership”. Truncation/stemming also happens automatically, so you don’t need to add a * to the word stem.
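To make the difference between the two kinds of search concrete, here is a minimal sketch in Python. It is a toy model only, not ICPSR's actual search code: the function names and sample titles are made up for illustration, and it does not attempt to reproduce the automatic stemming described above.

```python
import re

def parse_query(query):
    """Split a query into quoted phrases and bare terms (toy parser)."""
    # Treat curly quotation marks the same as straight ones.
    query = query.replace("“", '"').replace("”", '"')
    phrases = re.findall(r'"([^"]+)"', query)
    remainder = re.sub(r'"[^"]+"', " ", query)
    terms = remainder.split()
    return [p.lower() for p in phrases], [t.lower() for t in terms]

def matches(query, text):
    """Return True if the text satisfies the query under these toy rules."""
    phrases, terms = parse_query(query)
    words = re.findall(r"[a-z0-9]+", text.lower())
    normalized = " ".join(words)
    # Bare terms: every term must appear somewhere (implicit AND).
    all_terms = all(t in words for t in terms)
    # Quoted phrases: the words must appear next to each other, in order.
    all_phrases = all(
        f" {' '.join(p.split())} " in f" {normalized} " for p in phrases
    )
    return all_terms and all_phrases

# Hypothetical study titles, invented for this example.
title_a = "Survey of home ownership among young adults"
title_b = "Ownership of firearms in the home"

print(matches("home ownership", title_a))    # both words present
print(matches("home ownership", title_b))    # both words present, any order
print(matches('"home ownership"', title_a))  # exact phrase present
print(matches('"home ownership"', title_b))  # exact phrase absent
```

In this sketch the unquoted search matches both titles because each contains both words, while the quoted search matches only the first title, where the two words appear together in that order.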

The search results page has two major sets of options you can select. Running horizontally across the top of the results list are links to select what sort of items you want to search.

“Studies” means the system is searching description pages of studies. You can toggle the “summaries” button on the results page to display brief abstracts of the study’s purpose, methods, and results.

Some of the most useful filters on the left-hand side of the results page will include "Restriction Type", "Geography", and "Collection method." Restricted data will be covered more at the end of this FAQ, but for most student projects, you will want to select "Public Use" in this box. These filters can quickly eliminate extraneous studies or ones you cannot access and narrow the list down to just relevant items.

The study page will describe the study, link to the dataset(s), and provide a copyright license and other information about reuse. The “Variables” tab on the study page will show the variables in the study (and none outside of it). The “Data & Documentation” tab almost always contains the study’s codebook, often in a folder with other documentation. The codebook is crucial to figuring out if a dataset will work for your use case. If you’re not familiar with reading codebooks, the Princeton Firestone Library has an excellent introduction. If a study has been used in a paper, conference presentation, or other format, that information will be available on the “Data-related Publications” tab.

“Variables” will show individual variables with descriptions matching your keywords. This is a useful but sometimes tricky way to find data that fits your needs. The variable descriptions will be displayed in the search results page. Variables cannot be filtered in as many ways as studies can.

The "Dataset" column on the right can generally be ignored. It refers to the internal dataset number, so many variables from many different studies will all have "DS 1" in that column. However, you can expand the "Study" filter dropdown to see all studies that have variables in which your search terms were found.

You can select multiple variables with the tick boxes on the left side of each result and then click “compare” towards the top of the results. This takes you to a screen showing relevant information about your chosen variables, including the study they came from, the time period of that study, and the full variable description text.

The difficulty with variable searching is that a perfect variable may not have been accompanied in the original data collection by the other variables you need for your intended analysis. Variable searching will require going into the study page and looking through the codebook or variables list to see if the dataset’s content and structure are suitable for your purposes.

“Series” searches the description pages of grouped related studies. Series may be grouped by being iterations of the same study, studies by the same lab, or studies from the same funder. This can be useful for finding something like all of the ANES studies in one place, rather than having to sort search results to eliminate other studies using terms like "national", "election", and "american."

“Data related publications” searches ICPSR’s “bibliography” (it’s an index with citations, no guarantee of full texts) of works created using data stored in ICPSR. This index includes journal articles, books, and gray literature. This can be useful for finding a paper that seems relevant to your goals and then backtracking to find the corresponding dataset in ICPSR, or for finding examples of how a dataset was used previously.


ICPSR Subject Thesaurus

ICPSR uses a controlled vocabulary to describe items. This means that it attempts to standardize the terms used to describe subjects. In general, searching for the term ICPSR uses to describe a subject will increase the recall of your search. You can find the thesaurus, and more information about how it was created and its purpose, on the ICPSR: Subject Thesaurus page.


The dataset I’m interested in says it’s restricted, what does that mean?

Restricted-use data has been evaluated by ICPSR curators and found to have information that would identify study participants that cannot be removed without hindering the analytic value of the dataset. Accessing restricted data requires an application, sometimes IRB approval, and a terminal-degree-holding PI affiliated with a research institution sponsoring the request. However, documentation and supplementary materials such as codebooks and empty questionnaires are likely to be openly available – just not identifiable data.

Student projects should use public-use data. It is extremely unlikely that a student request for access to restricted data for a class assignment would be approved.

Please see “Accessing Restricted Data at ICPSR” for more information. Additionally, you should contact AU’s ICPSR representatives for assistance with your application.


If you’re interested in learning more about using ICPSR, the ICPSR YouTube channel has many videos about the platform. “Finding and Accessing Data at ICPSR” covers most of what’s in this FAQ and more.
