Before Searching...

The data sources listed in this guide (and all those to which Harvard Library subscribes) generally describe the data they contain as summary-level data, micro-level data, or codebooks. If these terms are unfamiliar to you, here are some definitions:

  • Summary-Level Data: Summary-level data are published data points in either print or electronic format. You would use summary-level data if you were looking for a quick statistic such as the unemployment rate for the current month or if you wanted to see a table of statistics, such as GDP for various countries during a specific time period.
  • Micro-Level Data: Micro-level data files contain the numerically coded individual responses to instruments such as census questionnaires and public opinion surveys. Compared with summary-level data, they give you much more flexibility to work with the data and run statistical analyses on the extracted data. The data are in an unanalyzed, raw format of columns and rows, usually but not always in ASCII format. Some raw data files are accompanied by files in SPSS, SAS, or other statistical software formats for easier use in those packages. If you are working with only the raw data, you must consult the data documentation (codebook) and write a small program or use an extraction program to have the computer "read" the data into a usable format.
  • Data Documentation/Codebooks: Codebooks provide information on the structure, content, and layout of a data file and the questionnaire, if any, used for the survey or study. Many codebooks are available electronically with the data file.
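
The "small program" mentioned under Micro-Level Data can be sketched roughly as follows. This is only an illustration in Python: the column positions, variable names, value labels, and sample records are all hypothetical and stand in for what a real codebook and raw ASCII file would supply. The last lines tally the decoded records to show how a summary-level figure could be derived from micro-level data.

```python
from collections import Counter

# Hypothetical codebook layout: (variable name, first column, last column).
# Columns are 1-indexed and inclusive, as codebooks usually list them.
CODEBOOK = [
    ("respondent_id", 1, 4),
    ("age", 5, 6),
    ("sex", 7, 7),
    ("employment", 8, 8),
]

# Hypothetical value labels, as a codebook would document them.
VALUE_LABELS = {
    "sex": {"1": "male", "2": "female"},
    "employment": {"1": "employed", "2": "unemployed", "9": "no answer"},
}

# Variables to convert to integers instead of decoding with labels.
NUMERIC = {"age"}

# Made-up raw records standing in for lines of a fixed-width ASCII file.
RAW_LINES = [
    "00013412",
    "00022721",
    "00035819",
]


def parse_record(line):
    """Slice one fixed-width line into named, decoded variables."""
    record = {}
    for name, first, last in CODEBOOK:
        raw = line[first - 1:last]  # convert 1-indexed columns to a slice
        if name in NUMERIC:
            record[name] = int(raw)
        else:
            # Fall back to the raw code when no label is defined.
            record[name] = VALUE_LABELS.get(name, {}).get(raw, raw)
    return record


records = [parse_record(line) for line in RAW_LINES]

# Tally one variable across respondents: a summary built from micro data.
employment_counts = Counter(r["employment"] for r in records)

for r in records:
    print(r)
print(dict(employment_counts))
```

With a real data file, the layout and labels would be copied from the codebook rather than invented, and the lines would be read from the file itself. Statistical packages and extraction programs perform the same steps on a larger scale.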