Social Security Awards Depend More on Judge than Facts
Disparities Within SSA Disability Hearing Offices Grow
A court-by-court analysis of close to two million Social Security Administration (SSA) claims has documented extensive and hard-to-explain disparities in the way the administrative law judges (ALJs) within the agency's separate hearing offices decide whether individuals will be granted or denied disability benefits.
These findings — discussed in detail below — suggest that in many SSA hearing offices today, the chance a disability claim is granted or denied is often determined more by the particular judge assigned to handle it than by the facts and circumstances presented in the case.
The findings further document that the problem is not simply the result of a few judges whose decisions are far out of line with those of other judges on the bench. Rather, the agency's own case-by-case evidence demonstrates that the problem is systemic. To a surprising extent the records on disability decisions show again and again that even within the individual offices there is not a clear consensus among the judges about which claims should be awarded versus which should be denied.
Beyond establishing the current extent of the disparities within the SSA's hearing offices, the data show that the problem today is somewhat worse than it was four and a half years ago, when the agency launched a broad program to reduce the vast number of individuals awaiting a hearing and to speed up the processing of cases.
The first report in TRAC's two-part series examined the impact of these program initiatives on hearing backlogs and how quickly cases were decided. This report traces what has happened to the decisions themselves. The core question asked here: is the agency now doing a better job in fairly deciding when to award a benefit, and when to deny one, than in the past?
SSA Statement on TRAC's Findings
On July 4, 2011, the SSA released a statement sharply criticizing this study on several different grounds. A point-by-point comparison of the SSA's criticisms and TRAC's report indicates that the agency's objections are unjustified in light of what our report actually says.
A comparison of cases decided in all SSA offices between the FY 2005-2006 and the FY 2010-2011 periods shows that during this span of years the problem of decision disparity grew. For a typical office, disparity increased by 10 percent. But the record in individual offices across the nation varied sharply, with some of them undergoing substantial deterioration. Among offices with the greatest disparity increases were those in Jackson, Mississippi; Morgantown, West Virginia; Greenville, South Carolina; San Jose, California; and Milwaukee, Wisconsin.
These findings about the operation of what arguably is the largest such court system in the world are based on a new and extremely detailed analysis by the Transactional Records Access Clearinghouse (TRAC) at Syracuse University (see About the Data). With this analysis the public can obtain counts and rates for the SSA's many separate offices, as well as information about the performance of all of its many hundreds of semi-autonomous administrative law judges.
Administrative Law Judges
The position of the administrative law judge (ALJ) was established within the federal government by the Administrative Procedure Act of 1946.
The Stakes Are High
The cases handled by the SSA hearing offices, as discussed in our previous report, largely focus on applicants who are seeking disability payments under two major federal government programs: social security disability insurance (SSDI) and supplemental security income (SSI). In just the 18 months spanning FY 2010 and the first six months of FY 2011, the SSA hearing offices dealt with 1,124,346 of these matters.
The goal of developing a system that is both effective and fair raises a very challenging question: is the agency delivering timely benefits to those whom the safety net was designed to reach, while at the same time denying benefits to those who don't qualify?
According to the SSA, "a 20-year old worker has a 3-in-10 chance of becoming disabled before reaching retirement age." Not even counting spouses and children of disabled individuals, the number of people directly impacted by how well or how poorly this system functions is vast. Over 10 million individuals last year were receiving disability benefits under the social security disability insurance fund, up 53 percent in the last decade. An additional 7.7 million received benefits from the supplemental security income program, up 21 percent over the same period.
About the Data
TRAC assembled data from a number of sources for this study: publicly available information as well as internal data obtained by TRAC and others from the agency under the Freedom of Information Act.
Not only are millions of individuals directly affected by how well or how poorly the system meets these challenges, but the stakes in dollar terms to the taxpaying public are also huge. Federal expenditures for the programs last year were over $170 billion. At a time when entitlement programs are being scrutinized because of their rising costs, the disability insurance program is at the leading edge of the crisis. It is one of the fastest growing entitlement programs, and its trust fund is slated to run out in 2018, just a few years away. And because being found entitled to receive disability benefits also opens the door to support from other programs, including early coverage by Medicare, the potential sums of money involved are much greater than the $170 billion figure suggests.
In addition to the disabled and the American taxpayers, there are many special interest groups whose immediate interests are affected in the debate over what should be done. Here are two examples:
How Disparities Are Documented
To undertake this study, TRAC first identified all of the 1,422 judges who had each decided 100 or more disability appeals during the last 18 months. (This step ensured that only the records of judges who had made a sufficiently large number of decisions were examined.)
Next, for each of these selected judges, TRAC calculated the number of instances when they granted or denied a disability claim and determined the proportion of such actions in relation to the total of all their decisions. If a judge handled cases in more than one hearing office during the period, the judge's record in each hearing office that met this 100 decision standard was examined separately.
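The selection and rate calculation described above can be sketched in a few lines of Python. This is an illustration only, not TRAC's actual code: the record layout (one `(office, judge, granted)` tuple per decision), the function name `grant_rates`, and the sample data are all assumptions, as is the simplification that every decision is either a grant or a denial.

```python
from collections import defaultdict

MIN_DECISIONS = 100  # threshold stated in the report


def grant_rates(decisions, min_decisions=MIN_DECISIONS):
    """Return {(office, judge): grant rate in percent} for qualifying records.

    `decisions` is an iterable of (office, judge, granted) tuples, a
    hypothetical layout; `granted` is True for a grant, False for a denial.
    """
    totals = defaultdict(int)
    grants = defaultdict(int)
    for office, judge, granted in decisions:
        key = (office, judge)  # a judge's record is kept separately per office
        totals[key] += 1
        if granted:
            grants[key] += 1
    # Keep only judge-office records that meet the minimum decision count.
    return {
        key: 100.0 * grants[key] / totals[key]
        for key in totals
        if totals[key] >= min_decisions
    }


# Made-up sample: one judge sitting in two offices.
sample = (
    [("Office A", "Judge 1", True)] * 60
    + [("Office A", "Judge 1", False)] * 40
    + [("Office B", "Judge 1", True)] * 10  # only 10 decisions: excluded
)
rates = grant_rates(sample)
print(rates)
```

Keying the counts on the office and judge together mirrors the rule that a judge who sat in more than one office is examined separately in each.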
Because federal law requires that as far as possible the flood of cases coming to a hearing office be assigned on a random basis — the statute specifies "in rotation" — the general mix of cases assigned to each judge in a specific court should be roughly similar. Since each of these judges decides a large number of cases, this means that if the judges are applying similar standards to the incoming matters, then the proportion of claims granted by each should be roughly the same for all judges within that office.
Or put another way: to the extent judges' grant or denial rates depart from one another within the same hearing office, these differences cannot be explained on the basis of the worthiness of the cases handled by individual judges. One or more other factors — such as the pre-existing predilections of the judge assigned — must be the source for these disparities.
Similar sorts of comparisons should not be made between judges from two different offices. For instance, the makeup of the cases coming to a particular SSA office in Brooklyn may naturally be expected to differ from the makeup of those in Seattle or in other offices. Thus, there is no reason to expect Judge A's grant rate in Brooklyn would necessarily be the same as Judge B's rate in Seattle. Any differences might be attributable simply to the mix of the "worthiness" of the cases between the two offices. For example, grant rates nationally differ depending upon whether benefits are sought under the disability insurance or the supplemental security income programs. If a larger proportion of one type than another comes into a particular office, this will naturally have an impact on expected outcomes.
For this reason, to improve the comparability of the cases being considered, TRAC limited the comparisons to the cases heard by judges assigned to the same office. These comparisons were then made for each of the 155 current hearing offices across the nation, to assess the level of consensus in the decision-making process for each one (offices with fewer than four judges who had each decided 100 or more cases were excluded).
The evidence collected by this study allows the public to examine whether or not this very vital court system delivers "equal justice under the law," or delivers outcomes more by the luck of the assignment rotation that determines which judge hears the case. The goal here is not to assure that each judge in the nation is making the same decisions but to determine the extent to which their decisions are being reached in a reasonable and principled fashion. This is difficult if there isn't agreement on the standards.
What Are the Disparities?
To better understand the broad problem that faces the country as a whole, it helps to focus on one location. For example, during the very recent October 2009 — March 2011 period, the Texas SSA office in Dallas (North) had 15 administrative judges on its staff who had each decided 100 or more cases. As shown in Figure 1, the SSA records indicate that the judge grant rates in this single location ranged across a full spectrum: from less than 10 percent being granted to over 90 percent. Yet, as discussed previously, because the incoming cases are by law assigned in rotation, the kinds of matters coming to each judge should have contained approximately the same number of creditable claimants.
Yet among this total of 15 ALJs in Dallas were three judges who granted 30 percent or fewer of all of the cases they processed. And in the same period, there also were three others who granted 70 percent or more of their cases. Even the larger group of nine ALJs that clustered in the middle of the range shown in Figure 1 had grant rates ranging between 43 percent and 68 percent, still a fairly wide range. The bottom line: in this office, claimants who were assigned to any one of the judges with higher grant rates, regardless of the facts of their particular situations, were far more likely to have their claim granted than if they had been assigned to a judge who denied claims more frequently.
Using several different indices, TRAC's assessment of the grant and denial rates found wide disparities in court after court. For the first disparity measure, the difference between the grant rates of the judges at the top and the bottom of the decision scale in an office was examined. For example, if the grant rate for the judge with the highest grant rate was 80 percent and that for the judge with the lowest grant rate was 30 percent, then the disparity index for that office would be 50 percentage points. (Because grant and denial rates are mirror images of each other, the disparity index will be the same whether grant rates or denial rates are used.)
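As a minimal sketch, the first measure reduces to a maximum minus a minimum. The function name `disparity_index` and the sample rates below are illustrative assumptions; only the 80 and 30 percent endpoints come from the example in the text. The second call checks the mirror-image point: converting grant rates to denial rates leaves the index unchanged.

```python
def disparity_index(rates):
    """Highest minus lowest grant rate in an office, in percentage points."""
    if len(rates) < 2:
        raise ValueError("need at least two judges to compare")
    return max(rates) - min(rates)


# Illustrative grant rates (percent) for the judges in one office.
office_rates = [30.0, 45.0, 62.0, 80.0]

grant_gap = disparity_index(office_rates)
# Denial rate = 100 - grant rate, assuming every decision is a grant or a denial.
denial_gap = disparity_index([100.0 - r for r in office_rates])

print(grant_gap, denial_gap)
```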
Table 1 provides a first-of-its-kind interactive display which lists each of the hearing offices, ranked by the degree of disparity in judge grant rates for that office in the FY 2010-2011 period. By selecting a "judge" table you can also compare the grant rate and the number of decisions made for each administrative judge within that office, along with that hearing office's corresponding averages.
Based on the figures from the interactive display in Table 1, Figure 2 presents a picture of how the disparity levels in the 155 hearing offices stack up. The SSA records show that the affected offices are not just an isolated few but that the problem is extremely widespread. In fact, for a vast majority of offices (124 out of 155), the disparity was at least 30 percentage points; for some the rate was much higher.
Only seven out of the 155 courts had a disparity index of under 20 percentage points, that is, a difference of less than 20 percentage points in the grant or denial rate between the highest- and lowest-ranked judges within an office. And in no office was the difference less than 10 percentage points. Sixteen offices had disparity levels of 60 percentage points or higher.
Table 2 presents information about the ten hearing offices with the most and the ten with the least disparity, along with basic data about the hearing office that was exactly in the middle: half of the hearing offices had higher disparity levels and half had lower. Even for the judges assigned to this middle-of-the-road office in Alexandria, Louisiana, the differences were extensive: from a grant rate at the low end of 28.1 percent to a high of 70.3 percent. In this case the disparity index (the difference between the highest and lowest judge grant rates) was 42.2 percentage points. Expressed another way, a claimant was two and a half times as likely to be granted benefits if the case was assigned to the judge with the highest rather than the lowest grant rate in this office.
Table 2. Hearing Offices by Judge Decision Disparity, FY 2010 - FY 2011
To further examine the problem, TRAC undertook a second way of assessing whether the disparity score resulted from a few "outlying" judges — each having very high or low grant rates relative to the remainder of the judges — or whether judges more generally differed in their grant rates within that hearing office. Because the first measure focused on the outlying judges with very high or low rates, it does not reveal how the remaining judges' grant rates varied.
The second measure differs by examining the decisions of the judges in the "middle range" of each court. Here our focus is on whether these judges are in close agreement as reflected in their grant rates. To undertake this analysis, the 25 percent of judges with the highest grant rates and the 25 percent with the lowest rates were put aside. This left the 50 percent of judges whose grant rates were in the middle. For example, if a court had 12 judges we sorted them by their grant rates and turned our attention to the 6 in the middle. If among these 6 middle judges the highest grant rate was 55 percent while the lowest grant rate was 45 percent, then our "middle range" disparity index would be 10 percentage points.
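The "middle range" measure can be sketched the same way. The report does not say how the top and bottom quarters were set aside when the number of judges is not divisible by four, so the rounding-down rule used here is an assumption, as are the function name and the sample rates (chosen so that the middle six run from 45 to 55 percent, as in the example above).

```python
def middle_range_index(rates):
    """Spread of grant rates among the middle half of an office's judges."""
    if len(rates) < 4:
        # The report only considers offices with at least four qualifying judges.
        raise ValueError("need at least four judges")
    ordered = sorted(rates)
    trim = len(ordered) // 4  # assumption: round each quarter down
    middle = ordered[trim:len(ordered) - trim]
    return max(middle) - min(middle)


# Twelve illustrative grant rates (percent): the lowest three and highest
# three are set aside, leaving the six in the middle.
court = [20, 30, 40, 45, 47, 49, 51, 53, 55, 60, 70, 85]
print(middle_range_index(court))
```

With twelve judges the function drops three from each end, so only the rates from 45 to 55 percent enter the calculation.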
If the judges in the middle group are in close agreement, the second measure for them will be small even though the office's overall disparity rate may be high when it was measured by the differences between the judges at both ends of the scale. The question is: to what extent did the rates among this middle group actually vary?
We did not find close agreement when we examined each of the offices using the second "middle range" measure. In fact, in only 17 out of the 155 hearing offices (11 percent of the total) were the grant rates of these "middle" judges within 10 percentage points of one another.
In only one hearing office, Atlanta (North) in Georgia, were the center judges within 5 percentage points of one another. Curiously, at the other end of this scale was another Georgia office, Macon, where the grant rates of the center judges differed among themselves by 43.4 percentage points.
The typical (median) difference was 17.8 percentage points, but the difference often ranged much higher. In addition to Macon, Georgia, which had the highest middle range disparity (43.4 percentage points), other offices with the highest scores were Miami, Florida (38.9); Columbus, Ohio (38.4); Dayton, Ohio (36.0); and Eugene, Oregon (35.7). Table 3 lists the top 10 offices with their overall and their middle range disparity scores.
The study also found a clear and highly statistically significant relationship between office rankings on the two measures. A total of 60 out of the 78 offices that ranked in the top half on their overall disparity scores also ranked in the top half on the middle range disparity scores. Or stated another way, offices that had greater overall disparity also tended to have lower consensus in grant rates even among the middle 50 percent of their judges (see Table 4). The complete list of each hearing office's overall and middle range disparity scores is available here.
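To give a rough sense of why a 60-of-78 overlap counts as strong, the sketch below computes a Pearson chi-square statistic for a two-by-two table of top-half versus bottom-half rankings. It is not the test TRAC ran, which the report does not name. Only the figures 60, 78 and 155 come from the text; the other three cells are derived under the assumption that exactly 78 offices also fall in the top half on the middle range measure, and Table 4 itself is not reproduced here.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


TOTAL_OFFICES = 155
TOP_HALF = 78

top_both = 60                              # from the report
top_overall_only = TOP_HALF - top_both     # derived from the report's figures
top_middle_only = TOP_HALF - top_both      # ASSUMED: 78 offices in top half on measure two
bottom_both = TOTAL_OFFICES - TOP_HALF - top_middle_only

stat = chi_square_2x2(top_both, top_overall_only, top_middle_only, bottom_both)
print(round(stat, 1))
```

Under these assumptions the statistic comes out a little above 44, far beyond the value of about 10.83 conventionally required for significance at the 0.001 level with one degree of freedom.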
Thus, offices that had large differences between the highest and lowest grant rates of the judges also showed low relative consensus among other judges serving in that office. And for the vast majority of offices, there was not close agreement in grant rates even for the middle range of judges. The bottom line: the current problem is not limited to a few out-of-line judges at either end of the scale. To be effective, attempts at reform should not treat this as if it were simply a "judge" problem. The real problems run much deeper and appear systemic.
How Did SSA Changes Announced in 2007 Affect Disparity?
In 2007 — facing rising hearing backlogs and lengthening wait times — the SSA launched an ambitious project to bring the backlogs down and to speed up the processing of these cases. TRAC's first report in this two-part series focused on whether, based on these announced goals, the agency's initiatives had met with success.
In focusing more heavily on speeding up the process, however, the various program changes encompassed in the 2007 initiatives paid little attention to ensuring the quality of the agency's decisions. As shown above, TRAC's detailed analysis of the SSA's recent operations has found that in many ways it is dysfunctional, often delivering outcomes more influenced by the identity of the judge than by the facts and circumstances of a particular case.
This is not entirely new. In fact, records developed from a series of scholarly, media and government reports going back for many years indicate that decision disparities have been a long standing problem of the SSA disability court. The question therefore is: in the last few years since the ambitious new project was announced in 2007, have decision disparities been reduced, or has the situation worsened?
To answer this question, TRAC analyzed the extent of disparities in the grant/denial rates of the judges within each SSA hearing office during fiscal years 2005 and 2006 — just prior to the launch of the current improvement plan — and compared disparity levels then with the situation during fiscal years 2010-2011. A total of 134 offices which existed during both time periods could be compared.
Again, two measures were calculated for each hearing office. As before, the disparity score measured the difference between the highest and lowest grant rates among judges within the same office. The "middle range" score assessed the difference between the grant rates for the middle half of the judges in an office. Again, only judges making at least 100 decisions during the FY 2005-2006 period were considered, and a hearing office had to have at least 4 judges to be considered.
Then each office's disparity score for the FY 2005-2006 period was compared with that same office's current disparity score from Table 1 to see if the extent of disparity had increased, stayed about the same, or decreased during this time span. A similar comparison was made for each office on its previous and current "middle range" scores to see if these had risen or fallen during this period.
As we noted in our first report, an important part of the SSA changes announced in 2007 was a major campaign to hire a large number of new judges. The SSA also established a series of new offices. As a natural result, there were many transfers and with the passage of time a fair number of the judges retired or took on new duties.
With this much change in the makeup of the judges serving in an office, some variation in the disparity would naturally be expected to occur over time. The arrival (or departure) of a new judge with a particularly high (or low) grant rate vis-a-vis those judges serving in an office could raise or lower the measured disparity level of that office. So another question asked in this study was: against this expected variability in office disparity scores across time, was there any discernible systematic shift in the level of disparity either up or down?
There was. Many more offices experienced an increase in disparity than showed a decrease. In fact 56 percent more offices had increasing disparity than declines — 67 with increases versus 43 with declines. The rest stayed about the same (within 3 percentage points). The 10 hearing offices with the sharpest increase in disparity are listed in Table 5.
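The comparison can be sketched as a three-way classification with a tolerance band. The 3-point band and the counts (67, 43, and 134 comparable offices) come from the report; the function name, the treatment of a change of exactly 3 points as "about the same", and the sample scores are assumptions.

```python
TOLERANCE = 3.0  # percentage points; the "about the same" band in the report


def classify_change(before, after, tolerance=TOLERANCE):
    """Label an office's change in disparity score between two periods."""
    change = after - before
    if change > tolerance:
        return "increase"
    if change < -tolerance:
        return "decrease"
    return "same"  # assumption: a change of exactly 3 points counts as "same"


# Illustrative scores only, not actual office data.
print(classify_change(35.0, 48.0))  # a 13-point rise
print(classify_change(40.0, 42.0))  # a 2-point rise, inside the band
print(classify_change(50.0, 41.0))  # a 9-point drop

# Reproducing the summary arithmetic from the counts given in the report.
increases, declines, comparable = 67, 43, 134
unchanged = comparable - increases - declines
pct_more = 100.0 * (increases - declines) / declines
print(unchanged, round(pct_more))
```

The last two lines recover the 56 percent figure and show that the remaining 24 offices fall in the "about the same" group.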
As shown in Figure 3, we also found that the typical or median disparity level on each of our two measures increased between FY 2005-2006 and FY 2010-2011. For the FY 2005-2006 period the median disparity in these 134 offices was 40.0 percentage points. By the FY 2010-2011 period it had grown to 43.8, an increase of about 10 percent. The "middle range" differences for our second measure showed a smaller increase of about 4 percent (17.8 to 18.5).
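The relative increases follow from the medians by simple arithmetic, sketched below with a hypothetical helper `percent_increase`. Because the medians printed in the text are themselves rounded to one decimal place, the results computed from them can differ slightly from figures calculated on unrounded data.

```python
def percent_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


overall = percent_increase(40.0, 43.8)  # median overall disparity
middle = percent_increase(17.8, 18.5)   # median "middle range" disparity

print(f"{overall:.1f} {middle:.1f}")
```

From the printed medians the overall increase works out to roughly 9.5 percent and the middle range increase to roughly 3.9 percent, consistent with the rounded figures cited in the text.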
Thus, while the agency's major focus in recent years has been on speeding up the decision process, the quality of decisions appears not to have improved but to have deteriorated.
This report documents very serious problems in the operations of a major federal agency with a vast impact, for good and ill, on the lives of millions of individuals and the economy of the United States. These problems demand serious attention. But before the repairs begin, it is essential that the Obama Administration, the SSA, Congress and the other players better understand the full dimensions of the challenge before them.