This report discusses the development of TRAC's federal sentencing data. Follow this link for information concerning the contents of the current data.
The Transactional Records Access Clearinghouse (TRAC) at Syracuse University was launched almost a quarter of a century ago with a broad and apparently simple mission: provide the American people and the institutions of oversight with the authoritative information they require to independently judge the actual performance of the federal government.
TRAC's work developing judge-specific federal sentencing data began over two decades ago as part of a larger effort to develop comprehensive case-by-case, prosecutor-by-prosecutor, judge-by-judge details on all federal criminal and civil court actions. To achieve this goal, as well as the broader mission at TRAC, several distinct but inter-related activities have been needed.
Investigating Potential Data Sources and Obtaining Access to Them. To develop comprehensive data on federal criminal and civil court enforcement actions, TRAC has mounted a series of initiatives over the past twenty years seeking access to data collected both by federal agencies and by the federal courts. Obtaining federal agency data and obtaining federal court data have required different strategies.
For federal agency data TRAC has relied upon active, persistent and informed use of the Freedom of Information Act (FOIA) to obtain internal, timely and very detailed information from numerous federal agencies that are responsible for such critical federal activities as criminal and civil law enforcement, the regulation of immigration, tax collection, the award of disability benefits, and many other topics.
Since the federal judicial branch is not subject to FOIA requirements, for court records TRAC has had to rely upon the various mechanisms through which the courts have made their data accessible to outside researchers and to the public at large. Along the way a number of researchers within the judicial branch have assisted us, and directed us to useful data sources. On occasion they have provided files to us directly. Most recently the advent of PACER (Public Access to Court Electronic Records) has proven a great aid and one that TRAC has increasingly come to depend upon in compiling our data.
One of the early fruits of these efforts, begun in 1989, was TRAC's study of federal prosecutors. This effort relied on combining case-by-case files from the federal courts obtained from the Federal Judicial Center with similar files obtained through FOIA from the Executive Office for United States Attorneys (EOUSA) as well as employee-by-employee files from the Office of Personnel Management.
It took nearly a decade of additional research and painstaking development work to add judge and prosecutor names to these data. Late in 2002 we finished the development of our initial web interface that allowed others to access these judge and prosecutor-specific data and run queries on our computer servers. In the spring of 2012 we introduced a reporting tool that facilitates the examination and comparison of sentencing practices of federal district court judges. In the fall of 2012 we rolled out the second edition of this judge tool, and inaugurated monthly updates to TRAC's judge data to keep the information current.
Navigating Roadblocks, and Ensuing Litigation. As the center's name suggests, TRAC's focus has been on obtaining access to transactional data on criminal and civil cases, tracking the details on the day-to-day activities that take place. This type of information is found in the internal management systems maintained by federal agencies and the federal courts. TRAC has encountered many roadblocks along the way in pursuing our goal of building comprehensive transactional data files.
In the case of court data, it has been a matter of long-standing policy for the federal courts to strip judge identifiers from any data files that were released to researchers. As to federal agency databases, when we began most agencies had not been confronted with FOIA requests for these types of records, and many actively resisted TRAC's efforts to obtain access. Even after access was granted, officials sometimes reversed course and started withholding the data covering subsequent time periods. Simply uncovering what data the federal government keeps internally has been a challenge, and doing so still requires ongoing, active FOIA campaigns.
After unsuccessful efforts to obtain access to more detailed data on court cases kept by each U.S. Attorney Office, in March of 1998 TRAC's co-directors filed suit to test the legality of this withholding under FOIA. This test case sought access to the database records from two U.S. Attorney offices [Susan Long and David Burnham, Transactional Records Access Clearinghouse Syracuse University vs. United States Department of Justice, Civil Action 5:98-00370 (N.D. N.Y.)]. During the course of this litigation we uncovered evidence that the data we sought were also being improperly destroyed. This was occurring even though the data files, because of their historical importance, were classified for permanent retention by the National Archives and Records Administration (NARA). Accordingly, we filed a second lawsuit, joining with our pro bono attorneys at Public Citizen, seeking to ensure preservation of these records [Susan Long, David Burnham, and Public Citizen vs. Janet Reno (Attorney General), Executive Office for United States Attorneys, and John Carlin (Archivist of the United States), Civil Action 1:98-01855 (D. D.C.)].
As a result of this litigation, the government ultimately agreed to preserve all database files. It also released the records we sought in our first suit, and provided assurances that it would continue to release the same — as well as updated — data from all U.S. Attorney offices in the future. The ink was barely dry on this stipulated settlement, however, when the Executive Office for United States Attorneys (EOUSA) reversed course and refused to provide any updated information on the court cases it had already released. Further, the EOUSA also refused to release any data for the remaining U.S. Attorneys. Following this abrupt reversal, the EOUSA began delaying and then stopped providing the central files that TRAC had been regularly receiving which were needed to keep our existing data current.
Unable to obtain access to the data we needed, and that the Justice Department had agreed it would provide, we returned to court in 2000 for a third time [Susan B. Long and David Burnham vs. United States Department of Justice, Civil No. 1:00-cv-00211 (USDC, DC)]. In the long course of this litigation, and a related FOIA lawsuit filed in 2002 [Susan B. Long and David Burnham vs. United States Department of Justice, Civil No. 1:02-cv-02467 (USDC, DC)] the government (as a result of repeated inaccuracies in the declarations it had filed) was ordered by the court to do a very thorough search for the existence of all records sought in the case. This so-called Vaughn Index disclosed a treasure trove of previously hidden internal government data files, all of which ultimately were released to TRAC. In addition, as a result of the litigation, TRAC is now able to obtain updated records on a regular monthly cycle. As an outgrowth of this litigation, we are also now receiving the comprehensive data maintained in each U.S. Attorney office that we originally sought in our 1998 lawsuit. Cross-appeals on a few specific data fields remain pending before the Court of Appeals.
TRAC has also been successful in getting data from the Criminal Division, the Civil Rights Division, the Environment and Natural Resources Division, and the National Security Division (although new roadblocks continue to crop up from time to time). After a number of years of fruitless efforts, TRAC filed suit in 2004 to obtain access to the Civil Division's internal database records on court cases [Susan B. Long and David Burnham vs. United States Department of Justice, Civil Action 5:06-1086 (N.D. N.Y.)]. As a result of this suit, the Civil Division is now regularly providing TRAC access to its internal database records. A few issues concerning release of some specific data fields remain under litigation in that case as well.
Obtaining Documentation. Once files arrive at TRAC's office, the next challenge is transforming them into usable research data. Data obtained through FOIA from federal agency internal management databases typically arrive with little if any documentation of what they contain, even as to what fields of information are included. Under FOIA, agencies are under no obligation to include such information, to discuss what is in the files, or to answer questions about them. Sometimes agencies are helpful; frequently they are not. And the hard truth is that even when officials want to be helpful, the type of documentation scholarly researchers want and need typically hasn't been compiled by the agency, since these databases were never created with that use in mind.
Thus obtaining access to the raw files we use to build our data is only part of the battle. To piece together the information about the files, it has been necessary to file additional FOIA requests to obtain the records that may exist and that could help to document the data — computer program listings, system requirement packages, other technical documentation, database schema stored as part of the database itself, user manuals, the terms of government contracts, the reports delivered under these contracts from outside firms that may have helped develop or maintain the government's data, and so forth. FOIA litigation to obtain some of this information has also been required.
Unfortunately, obtaining supporting documentation is an ongoing process. Federal databases aren't static; rather, they change with the evolution of computing and database systems, as well as with fluctuating agency needs. Just when one database system is mastered, it changes, creating another mystery to solve.
Transforming Raw Files Into Usable Information. Once we receive the data and supporting files, we undertake extensive tests to determine whether the data appear to be sufficiently reliable to use. A standard test we employ is the following: Using the data, can we reproduce the figures that have been cited in the government's own internal or published reports, press releases and testimony before Congress? If there are discrepancies, we investigate what accounts for these differences and what this implies about the reliability and usability of the data. We also examine the records to look for internal consistency in the information. For example, if a case records that "A" occurred on a specific date, do other fields record information that is consistent with this, or does some of the information seem to contradict it? To facilitate this data validation work, we also aggressively use FOIA to obtain a comprehensive picture of how the government records and maintains this data, including its quality control efforts.
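As an illustration only, the two kinds of tests described above might be sketched as follows in Python. The record layout, the field names (`fiscal_year`, `filed`, `sentenced`) and the published totals are hypothetical stand-ins, not TRAC's actual files or procedures.

```python
from datetime import date

# Hypothetical case records; field names are illustrative only.
cases = [
    {"id": 1, "fiscal_year": 2011, "filed": date(2010, 11, 3), "sentenced": date(2011, 6, 1)},
    {"id": 2, "fiscal_year": 2011, "filed": date(2011, 2, 14), "sentenced": date(2011, 1, 9)},
    {"id": 3, "fiscal_year": 2012, "filed": date(2011, 12, 5), "sentenced": None},
]

# Hypothetical yearly totals as they might be cited in an agency report.
published_counts = {2011: 2, 2012: 2}


def count_discrepancies(records, published):
    """Return {year: (computed, published)} for years where the counts differ."""
    computed = {}
    for rec in records:
        computed[rec["fiscal_year"]] = computed.get(rec["fiscal_year"], 0) + 1
    return {
        year: (computed.get(year, 0), expected)
        for year, expected in published.items()
        if computed.get(year, 0) != expected
    }


def inconsistent_dates(records):
    """Return ids of cases whose sentencing date precedes the filing date."""
    return [
        rec["id"]
        for rec in records
        if rec["sentenced"] is not None and rec["sentenced"] < rec["filed"]
    ]


print(count_discrepancies(cases, published_counts))  # {2012: (1, 2)}
print(inconsistent_dates(cases))                     # [2]
```

Neither result proves the data are wrong. Each flagged year or case is a starting point for investigating what accounts for the difference.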
If the data pass our validity tests, we can then begin to develop the data for use. This involves an additional series of steps to develop master databases, which often requires combining the data with other sources to enhance their usefulness. For example, detailed research in court records is needed in order to add prosecutor and judge-specific details to each case. A master database containing all sections and subsections of the entire U.S. Code is used to parse and classify information on each charge. In some cases, additional research is needed to resolve classification issues manually. Data obtained from federal agencies are combined and then compared with court records and any conflicts found are investigated.
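A minimal sketch of how a charge citation could be parsed and matched against a master table of U.S. Code sections is shown below. The table fragment, the category labels, the citation pattern and the function name `classify_charge` are all assumptions made for illustration; they do not describe TRAC's actual database or classification scheme.

```python
import re

# Hypothetical fragment of a master table keyed by (title, section).
# Category labels are illustrative only.
USC_MASTER = {
    (8, "1326"): "Immigration",
    (18, "922"): "Weapons",
    (21, "841"): "Drugs",
}

# Matches forms such as "21 U.S.C. § 841(a)(1)" or "18 USC 922(g)".
CITATION = re.compile(
    r"^\s*(?P<title>\d+)\s*U\.?\s*S\.?\s*C\.?\s*(?:§+|Sec\.?)?\s*"
    r"(?P<section>\d+[A-Za-z]?)(?P<subs>(?:\([0-9A-Za-z]+\))*)\s*$",
    re.IGNORECASE,
)


def classify_charge(text):
    """Parse a citation into (title, section, subsections, category).

    Returns None when the text cannot be parsed. The category is None when
    the section is not in the table. Both cases are left for manual review.
    """
    match = CITATION.match(text)
    if match is None:
        return None
    title = int(match.group("title"))
    section = match.group("section")
    subsections = re.findall(r"\(([0-9A-Za-z]+)\)", match.group("subs"))
    return title, section, subsections, USC_MASTER.get((title, section))


for charge in ["21 U.S.C. § 841(a)(1)", "18 USC 922(g)", "18 USC 9999", "conspiracy"]:
    print(charge, "->", classify_charge(charge))
```

Returning `None` rather than guessing keeps unparsed or unknown citations visible, which mirrors the point above that some classification issues have to be resolved manually.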
We also create a variety of useful indices that summarize agency accomplishments, as well as a variety of performance measures. This work can be time-consuming, since it requires researching and deciding what needs to be measured and how the desired measure can best be made operational. Often, in addition to raw counts, per capita figures are developed to allow the comparison of federal districts, because districts vary greatly in size. TRAC developed (and maintains, as census boundaries change) a cross-walk of the boundaries of each federal district to census records in order to compile census population and other statistics for each district. Related statistics such as percentages, averages, medians, constant dollar figures, and rankings are then added.
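The per capita comparison can be sketched in a few lines of Python. The county-to-district mapping below is a hypothetical three-county cross-walk, and the populations and prosecution counts are invented for the example; none of the numbers come from TRAC's data.

```python
# Hypothetical cross-walk from census county codes to federal districts.
COUNTY_TO_DISTRICT = {
    "36067": "N.D. N.Y.",
    "36001": "N.D. N.Y.",
    "36061": "S.D. N.Y.",
}

# Invented figures, for illustration only.
county_population = {"36067": 400_000, "36001": 300_000, "36061": 1_500_000}
prosecutions = {"N.D. N.Y.": 140, "S.D. N.Y.": 150}


def district_populations(crosswalk, populations):
    """Sum county populations into a total for each federal district."""
    totals = {}
    for county, district in crosswalk.items():
        totals[district] = totals.get(district, 0) + populations[county]
    return totals


def per_capita_ranking(counts, populations, per=100_000):
    """Rank districts by count per `per` residents, highest rate first."""
    rates = {d: counts[d] / populations[d] * per for d in counts}
    ordered = sorted(rates.items(), key=lambda item: item[1], reverse=True)
    return [(rank, d, round(rate, 1)) for rank, (d, rate) in enumerate(ordered, start=1)]


pops = district_populations(COUNTY_TO_DISTRICT, county_population)
print(per_capita_ranking(prosecutions, pops))
# [(1, 'N.D. N.Y.', 20.0), (2, 'S.D. N.Y.', 10.0)]
```

In this invented example the district with the larger raw count ranks second once population is taken into account, which is the reason for computing per capita figures at all.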
Work does not stop once a master database is completed. Currency of data is important to maximize their usefulness in monitoring government activity. We employ a monthly update cycle in which the entire process is repeated with the new records that are added. As government recording systems evolve, this process must also continually monitor the data for changes in the definitions of categories used. When changes are detected, research is required and decisions are needed on how best to keep definitions as consistent as possible across time, so that enforcement and litigation trends can be monitored accurately.
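One simple way to notice that category definitions may have changed is to compare the codes appearing in each new batch against the codes already documented. The sketch below assumes a hypothetical `disposition` field and an invented code table; it illustrates the idea and is not a description of TRAC's monitoring process.

```python
def compare_codes(known_codes, batch, field):
    """Compare the codes used in a new batch against the known code table.

    Returns (unseen, unused): `unseen` maps each unrecognized code to the
    number of records using it, and `unused` lists known codes that do not
    appear in the batch at all.
    """
    seen = {}
    for record in batch:
        code = record[field]
        seen[code] = seen.get(code, 0) + 1
    unseen = {code: n for code, n in seen.items() if code not in known_codes}
    unused = sorted(code for code in known_codes if code not in seen)
    return unseen, unused


# Invented code table and monthly batch, for illustration only.
known_dispositions = {"GP": "Guilty plea", "DM": "Dismissed", "AQ": "Acquitted"}
monthly_batch = [
    {"id": 101, "disposition": "GP"},
    {"id": 102, "disposition": "GX"},
    {"id": 103, "disposition": "GX"},
    {"id": 104, "disposition": "DM"},
]

print(compare_codes(known_dispositions, monthly_batch, "disposition"))
# ({'GX': 2}, ['AQ'])
```

An unrecognized code signals that research is needed before the batch is merged. A known code that goes unused in a single month may mean nothing, but one that disappears for good can indicate that a category was renamed or retired.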
 Apart from institutional support from Syracuse University, which provides TRAC with office space on campus, TRAC's work relies almost entirely on support from foundation grants, gifts, and in-kind donations of legal services and faculty time. To preserve our independence and avoid conflicts of interest, we do not accept funding from the federal government. To support our public service mission, TRAC also provides web access plus data querying and reporting tools that permit others to use our data warehouse to conduct their own analyses, on TRAC's servers, of the databases we have developed. Some of the costs of providing these services are offset through subscriber fees. We also support a TRAC Fellows program to allow scholars to conduct their independent research utilizing the data we have developed.
 Some of the transactional data on government activities that TRAC gathers focus upon agency FOIA decisions themselves. In 2011 TRAC launched foiaproject.org, a daily-updated website that provides free access to, and a variety of search tools on, each new FOIA court filing and decision as soon as it occurs. Funding has recently been received to begin expanding our coverage to include the handling of initial FOIA requests, agency by agency. Upload facilities allow the public to share commentary and federal agency responses to their FOIA requests. Both TRAC Co-Directors were appointed to the National Freedom of Information Act Hall of Fame in 2006 for their lifetime activities on public transparency. In 2012, Co-Director Long also was awarded the FOIA Robert Vaughn Legend Award by the Collaboration on Government Secrecy at the Washington College of Law, American University.
 TRAC's October 1989 report examined the work of federal prosecutors in eleven United States Attorney offices in the handling of both criminal and civil cases brought in the federal courts over the decade of the eighties. A second report published the following year extended coverage to all U.S. Attorney offices and focused on staffing levels. (The study found wide differences in the relative staffing levels across federal districts and led to congressional hearings and heightened attention within the agency to improve the allocation of attorneys among districts to reduce unwarranted disparities.) A third report brought out in three volumes in 1991 focused on federal environmental litigation from referral to eventual outcome and matched up case dates with the precise dates U.S. Attorneys served to document how they exercised their discretion on which cases to pursue.
 For judges, see: http://trac.syr.edu/tracreports/judge/judge_medtimeG.html (February 4, 2003) and for federal prosecutors see: http://trac.syr.edu/tracreports/pros/ausa_pctdecG.html (January 24, 2003).
 Efforts by one of TRAC's co-directors to obtain access to computer data files actually date back more than four decades, to a time when the law was unsettled on FOIA coverage of electronic records. She won one of the first court decisions finding "computer data tapes" subject to FOIA. See Long v. Internal Revenue Service, 596 F.2d 362 (9th Cir. 1979), cert. denied, 446 U.S. 917, 100 S.Ct. 1851, 64 L.Ed.2d 271 (1980). See also Long v. Bureau of Economic Analysis, 646 F.2d 1310.
 TRAC's efforts, of course, began in the era of large mainframe computers and open reel tape storage systems. As technology and computer systems have advanced, so too have database systems. And throughout the last twenty years, government has also proven adept at assigning new names and acronyms to its data systems, creating a continuing game of hide and seek. For example, immigration cases have grown to over half of the federal criminal caseload. After the formation of the Department of Homeland Security (DHS), a major effort began to integrate the various agency legacy database systems. When data that had formerly been publicly available migrated to the new Enforcement Integrated Database (EID), it suddenly became unavailable. In October 2012 TRAC filed suit against Immigration and Customs Enforcement (ICE) challenging its ruling that this master repository, which it operates for the DHS, is off-limits to the public. See Long and Burnham v. Immigration and Customs Enforcement, Civil Action No. 1:12-01725 (D. D.C.).