The Internet is a dynamic place. While that is a benefit when we want new information quickly, it is a challenge in the legal field when we need a reliable, stable platform on which to build our arguments or conduct research. This backdrop is where the Wayback Machine comes in. The Wayback Machine is a service provided by the Internet Archive, a non-profit digital library dedicated to the preservation of our digital culture, which contains more than 26 years of archived web pages available for our viewing. This resource can serve our profession in two ways: first, as a tool for researching information that has disappeared from the public-facing internet; and second, as a tool for providing a stable link for use in briefs and motions. There are for-profit entities that will charge you for a similar service; however, the Internet Archive is free and has a proven track record of stability.
How can this be useful? Let’s say that part of your theory of defense hinges on Snapchat’s Privacy Policy in place on June 8, 2019, but if you visit Snapchat’s website at the time of this writing, you will see that the current privacy policy was updated in 2022.
Once you have selected the correct site, the Machine will take you to a timeline and calendar where you can select the capture for the desired date and time. First, click on the correct year on the timeline, in this example, 2019.
Next, select the date and capture time you want from the calendar. A quick note: captures and dates marked in blue are preferable to those marked in green, so choose those if possible.
The Wayback Machine will now load snapchat.com as it looked on June 8, 2019, at the time you selected.
From here, you can navigate to the Privacy Policy and view it as it was on June 8, 2019.
To save it, you can either print it, capture it with software like WebPreserver, or link to it via the Wayback Machine. For more information on this last method, see the next section.
Stable Links for Citation
Citation to internet sources in motions or briefs can be a tricky thing. Sure, the Bluebook can tell you the “proper form” for an internet citation, but no amount of spading today guarantees that a link will work tomorrow. Not only can a website change its structure, rendering the link dead, but the site itself could disappear, taking all its data with it. The Wayback Machine can help.
In the exercise above, we located the Privacy Policy for Snapchat.com from June 8, 2019. If you needed to incorporate this page into a brief, you could either print or capture it as a PDF and attach it as an exhibit, or you could cite to the Wayback Machine’s version. This type of use is encouraged by the Internet Archive. Once you arrive at Snapchat’s privacy policy, the address bar shows an exact link to this version of the page.
Simply copy and paste the URL from the address bar into the citation in your brief. Now, when the court glowingly quotes your winning argument in a ruling, future lawyers reading it on Westlaw in ten years can click and read the original source material without encountering a dead link.
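The link in the address bar follows a predictable pattern: the Wayback Machine prefix, a 14-digit timestamp (year, month, day, hour, minute, second), and then the original address. The short Python sketch below builds such a link for a given date. The function name and the exact Snapchat address are illustrative assumptions. If no capture exists at the requested moment, the Wayback Machine redirects to the closest capture it has, so the link you ultimately cite should still be the one copied from the address bar after the page loads.

```python
from datetime import datetime

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_link(original_url: str, when: datetime) -> str:
    """Build a Wayback Machine link for a page as of a given date and time."""
    # Captures are addressed with a 14-digit timestamp: YYYYMMDDhhmmss.
    stamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


# Illustrative address; midnight is used because no time of day was specified.
print(wayback_link("https://snapchat.com/", datetime(2019, 6, 8)))
```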
But what if the information you want is not on the Wayback Machine yet? Perhaps the website hasn’t been recently archived, or worse, has never been archived? You can trigger the Wayback Machine to take a snapshot of a page on demand, which will give you a stable link to the information you want for citation. To trigger a capture, go back to the Wayback Machine homepage (https://web.archive.org). Instead of entering a query in the search box, enter the URL you would like to preserve in the “Save Page Now” box.
For instance, this blog had not been archived since January. I entered https://nlsblog.org into the “Save Page Now” box and told it to save. After a page where I confirmed what I wanted, the Wayback Machine got to work:
Now the Wayback Machine has a current snapshot. To copy the link, right-click on the “Visit Page” link and select “Copy Link Location” or visit the page itself and copy the URL from the address bar.
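Whether a page already has a capture can also be checked without clicking through the calendar. The Internet Archive offers an availability lookup at archive.org/wayback/available that answers in JSON. The sketch below only builds the lookup address and reads a response that has already been downloaded; it makes no network request itself. The function names are my own, and the sample response is a made-up illustration of the response shape, not real data.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def availability_query(page_url: str, timestamp: str = "") -> str:
    """Build the lookup address for the Wayback availability service."""
    params = {"url": page_url}
    if timestamp:
        # Optional YYYYMMDD (or longer) timestamp to find the capture closest to a date.
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the link to the closest capture, or None if nothing is archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Made-up sample response, for illustration only.
sample = (
    '{"archived_snapshots": {"closest": {"available": true, "status": "200", '
    '"timestamp": "20200101000000", '
    '"url": "http://web.archive.org/web/20200101000000/https://example.com/"}}}'
)
print(availability_query("https://nlsblog.org"))
print(closest_snapshot(sample))
```

If the lookup returns nothing, that is a signal to use “Save Page Now” as described above.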
As a warning, the Wayback Machine will not work with all websites. Some sites use special settings (robots.txt) to prevent automatic capture or crawling by search engines and similar services. For example, individual Facebook profiles are not available. A good rule is that if you can’t find it with Google, you probably won’t find it on the Wayback Machine.
Conclusion
As I said in the beginning, the Internet is a dynamic place, but we do not have to let it stop us from finding the information we need or cause us to worry about the citations in our legal arguments. The Wayback Machine can be a blissful island of stability in an ever-changing world. Cite with confidence.
eDiscovery, or electronic discovery, is the process of identifying, collecting, and analyzing electronically stored information (ESI) in order to be used as evidence in legal cases. This process can be time-consuming and costly, as it often involves manually reviewing large amounts of data. However, advances in artificial intelligence (A.I.) have opened up new opportunities for streamlining the eDiscovery process. One such technology is ChatGPT, a large language model developed by OpenAI.
ChatGPT is a powerful tool for natural language processing (NLP) that can understand and generate human-like text. This makes it an ideal candidate for use in eDiscovery, as it can quickly and accurately analyze large amounts of ESI in order to identify relevant information. For example, ChatGPT can be used to identify specific keywords or phrases within a document, classify documents by type, or even summarize the content of a document.
The introductory paragraphs above were generated by ChatGPT in response to a request to write a blog post on ChatGPT and eDiscovery. This is an example of how ChatGPT can generate text in such a way that one cannot immediately tell whether it was written by a machine or a human. This blog post provides initial takes on what the potential ramifications of ChatGPT and similar Artificial Intelligence (A.I.) tools may be for the work CJA panel attorneys and federal defenders do. It does not advocate any specific position regarding A.I. technology, which has wide-ranging and yet-to-be-realized implications in many fields. The goal is to provide a general idea of how this new A.I. technology might impact our work.
What is ChatGPT?
The current version of ChatGPT, 3.5, was released in late 2022 (openai.com/blog/ChatGPT). It is an artificial intelligence tool built on a natural language processing model known as a Generative Pre-trained Transformer (‘GPT’) or ‘generative A.I.’ developed by OpenAI. ChatGPT is great for generating human-like text to help solve problems. This can include answers to questions, summaries or translations of large volumes of text, generating lines of code, or providing step-by-step, conversational instructions for a wide range of complex software applications.
ChatGPT is trained on a massive corpus of datasets drawn from many publicly available domains on the internet, including Google, the Wayback Machine, GitHub, WordPress, Wikipedia, and so forth. However, it is not connected to the internet in real time and has limited knowledge of the world and events after 2021. This means it can occasionally produce inaccurate information, a problem that OpenAI acknowledges (help.openai.com/en/articles/6783457-chatgpt-general-faq). In some instances, it will tell you it doesn’t know; sometimes it will provide an answer with a disclaimer. It can also provide an authoritative-sounding answer that is wrong without any qualifier. It has even been known to fill in the gap with made-up information. For example, eDiscovery expert Ralph Losey asked the robot to identify the top five eDiscovery cases for 2022. Since it did not have any 2022 cases to reference, it ignored the date, listed only 2021 cases, and even made up the name of a judge (ediscoverytoday.com/2023/01/02/ai-top-cases-of-2022-doesnt-include-any-cases-from-2022-artificial-intelligence-trends/).
In response to these sorts of user experiences, OpenAI recently sent out a tweet with warnings noting that ChatGPT is useful for general information in subject areas such as language, science, engineering, finance, history, culture; and less suitable for high context or niche areas such as legal advice, and real time events. twitter.com/openaicommunity.
Can ChatGPT be used for discovery review?
Artificial intelligence models based on natural language processing have been deployed extensively in eDiscovery for some time. Foremost among these approaches is Technology Assisted Review (TAR)[1], which uses algorithms to identify and highlight relevant information based on input from subject matter experts. This technique helps reduce attorney review time, thereby creating time, cost, and workflow efficiencies.
Since TAR and generative A.I. are both based on the natural language processing branch of artificial intelligence (Figure 1), one might assume that ChatGPT’s ability to generate human-like information about a broad and complex range of data sets could be easily applied to eDiscovery to enhance eDiscovery review methods such as TAR. Indeed, in the second introductory paragraph above, ChatGPT generated text that describes common eDiscovery tasks that artificial intelligence software can perform with the proper conditions. But it also wrote that it, ChatGPT, could do these types of tasks. While it is true that ChatGPT can perform these tasks based on information it has been trained on, it was not designed to perform eDiscovery tasks, and OpenAI has not developed a version of the GPT technology that can be utilized for eDiscovery. Furthermore, even if the underlying GPT-3.5 model could be developed for an eDiscovery environment, the immense computing resources it currently requires, designed for vast amounts of data, would make it non-scalable and cost-prohibitive. law.com/legaltechnews/2023/01/25/what-will-eDiscovery-lawyers-do-after-chatgpt/
Figure 1.
What can ChatGPT do right now?
ChatGPT has more direct application in terms of workflow and analysis. Discovery in criminal cases increasingly includes both structured (databases, spreadsheets) and unstructured (documents, videos, audio files, phone extractions, social media, emails) data. Currently, most workflows designed to integrate and synthesize these heterogeneous formats are necessarily cumbersome, requiring a patchwork of approaches. Many readily available open source tools (e.g. Openrefine, referenced below) and applications such as Microsoft Excel, which can be helpful to practitioners, are under-utilized, if leveraged at all. ChatGPT has the potential to help bridge the gap between the utility of these applications and practitioners’ ability to utilize them.
For example, below (Figure 2) is a screenshot showing ChatGPT’s response to a question about importing a CSV file[2] into CaseMap (a fact and case organization and analysis tool; see nlsblog.org/2011/10/05/cja-panel-attorney-software-discounts). Note that while ChatGPT provides helpful feedback, it does not provide specific, practical instructions on how to carry out the importation of the CSV file into a CaseMap database. This is due to the limited information about CaseMap built into the OpenAI model. In this example, ChatGPT was able to produce a general step-by-step guide on how to import a CSV file into CaseMap, but there are better and more efficient ways to do so than what ChatGPT prescribed.
Figure 2.
In our second example (Figure 3), we see how ChatGPT can help us deal with CSV files containing ‘messy’ data, in this case duplicate rows in a spreadsheet. It provided guidance on how to utilize a tool called Openrefine (openrefine.org) to ‘clean up’ the spreadsheet.
Figure 3.
Since Openrefine is a free, open source tool, ChatGPT was able to develop more accurate information than one might expect when dealing with ‘closed’, proprietary tools such as CaseMap.
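To make the ‘duplicate rows’ problem concrete, here is a minimal Python sketch of the same clean-up idea. It is not what Openrefine does internally and not what ChatGPT suggested; it simply keeps the first copy of each row and drops exact repeats. The function name and the sample data are assumptions for illustration.

```python
import csv
import io


def drop_duplicate_rows(csv_text: str) -> str:
    """Return CSV text with repeated rows removed, keeping the first occurrence."""
    seen = set()
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        # Compare rows after trimming stray spaces so "A" and "A " count as the same.
        key = tuple(cell.strip() for cell in row)
        if key in seen:
            continue
        seen.add(key)
        writer.writerow(row)
    return output.getvalue()


# Illustrative data: the second "A,1" row is a duplicate and is dropped.
messy = "name,number\nA,1\nA,1\nB,2\n"
print(drop_duplicate_rows(messy))
```

As with any clean-up step, run it on a copy of the file so the original production stays untouched.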
Conclusion
The need to harness software to effectively work our cases will only increase as data complexity continues to ratchet up. ChatGPT can help facilitate the utilization and adoption of open source and business applications in response to these challenges, lowering the bar to access by providing on-demand, human-like support to practitioners. This can help with the ‘trees’ we believe are relevant to our cases, e.g. a subset of files responsive to a search query. This still leaves the ‘forest’: the large tranches of discovery which we load into review platforms such as Eclipse SE and Casepoint to parse and organize the data. Whether or how the generative A.I. technology underlying ChatGPT will have an impact in this latter arena remains to be seen.
[1] Also known as predictive coding, computer assisted review, or supervised machine learning.
[2] A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values. Each line of the file is a data record, and usually consists of tabular data from a database. The CSV file format is supported by a wide variety of business applications including MS Excel en.wikipedia.org/wiki/Comma-separated_values
This post is part of an ongoing series of videos on how Microsoft Excel can help CJA practitioners (including attorneys, paralegals, investigators, and mitigation specialists) in their CJA cases.
CJA panel attorneys and federal defenders frequently receive some of their discovery in spreadsheet or Excel format. Call detail records and indices listing information regarding discovery productions (often called document indexes) are two examples. Having the files in Excel format instead of PDF is advantageous, as spreadsheets are designed to sort and filter information by a single criterion or by multiple criteria. With voluminous information, this ability to sort and filter by multiple criteria can speed up review and allow you to identify the information you are interested in. For example, if you have telephone call records in Excel format, it is easy to filter by a number of phone numbers and quickly narrow the entries to review with several clicks. If the same information is in PDF format, the filtering would have to be done manually and would take much, much more time.
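The filtering idea is easy to see in a few lines of Python. The sketch below assumes the call records have been saved as a CSV file with columns named caller and callee; those column names, the function name, and the sample numbers are all hypothetical, and real call detail records will use whatever headers the provider chose.

```python
import csv
import io


def filter_calls(csv_text, numbers, columns=("caller", "callee")):
    """Return the rows in which any of the given phone numbers appears."""
    wanted = set(numbers)
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep a row if a number of interest shows up in any of the checked columns.
    return [row for row in reader if any(row.get(col) in wanted for col in columns)]


# Illustrative records only.
records = (
    "date,caller,callee\n"
    "2019-06-08,555-0100,555-0199\n"
    "2019-06-09,555-0111,555-0122\n"
    "2019-06-10,555-0199,555-0100\n"
)
for row in filter_calls(records, {"555-0100"}):
    print(row["date"], row["caller"], row["callee"])
```

This is the same narrowing that Excel's filter buttons perform with a few clicks; the code only makes the logic explicit.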
For those experienced with Excel, it is a welcome sight to see the data in Excel format. However, for the neophyte Excel user who only reviews PDF files, it can be frustrating to navigate Excel and review the data. This post provides quick and easy formatting options that are available within Excel that can save you review time.
Let’s look at a mock discovery document index as an example. Often the document index starts as a simple list of files with some basic information. As these lists become longer and include more document details (frequently in the form of additional columns), they can be hard to read and work with unless formats are applied. The video below demonstrates quick and simple options within Excel that can help to transform a basic list into a better-looking, more functional table that includes easy-to-use sort and filter features.
Video:
Some of the topics covered in the video above are:
Column Width: It often helps to be able to adjust the width of columns to better fit important information on your screen. To adjust the width of a column:
Move the cursor in between the column headers until it becomes a black line with two arrows.
To manually adjust the width of a column left-click then drag the black line to the right or left.
To automatically adjust a column, double left click and the column width will become as wide as the longest text entry in that column.
Multiple columns can be adjusted at the same time by selecting them before making a manual or automatic adjustment.
Wrap Text: Automatically resizing row heights helps to make the words within longer text cells visible. Select the cells, rows or columns to be adjusted, then choose the “Wrap Text” button from the Home menu.
Cell Alignment: Adjusting cell alignments can sometimes help make items more uniform and easier to view. By default, Microsoft Excel aligns numbers to the “Bottom-Right” of cells and text to the “Bottom-Left”. A common adjustment is to change cells that are “Bottom” aligned to become “Top” aligned, as that is generally easier to read. To do this, select the cells, rows or columns to be adjusted, then choose the “Top Align” button from the Home menu. “Right” aligned cells can be adjusted to “Left” alignment through a similar process.
Freeze Panes: Selecting certain columns and rows to always be visible greatly increases the readability of longer lists. To freeze panes:
Left click on the cell to the right of, and below the rows and columns you wish to always be visible.
From the View menu, click on the “Freeze Panes” button and select the “Freeze Panes” option.
Data Filter: Data filtering is a powerful formatting option. It unlocks the ability to easily sort, filter and search within columns. To turn on filtering:
Select all of the data including any column names.
From the Data menu, select the “Filter” button.
Once data filtering has been enabled, items can be sorted, searched and filtered on by choosing the filter button from the columns.
Format as Table: Alternatively, data can be filtered by selecting a “Format as Table” style. “Table” styles are a quick way to make the data visually pleasing, and they automatically include the data filtering feature. To turn a list into a “Table” style:
Select all of the data including any column names.
From the Home menu, click on the “Format as Table” button and select your desired style (I like the “Medium” styles personally).
From the “Create Table” dialog box select the “My Table has headers” option then click the “OK” button.
Note: Be aware that these format changes are modifications to the original Excel file. If preserving an original copy of the file is important make sure to choose the “Save As” option when saving changes.
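For readers curious about what sorting by more than one column means under the hood, the sketch below sorts a small mock document index first by custodian and then by document date. The column names (bates_start, custodian, doc_date), the function name, and the rows are hypothetical. Values are compared as text, so dates need to be in a year-month-day form to sort chronologically.

```python
import csv
import io


def sort_index(csv_text, keys):
    """Sort the rows of a CSV document index by one or more column names."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Build a tuple of the chosen columns so earlier keys take priority.
    return sorted(rows, key=lambda row: tuple(row[key] for key in keys))


# Mock index for illustration only.
index = (
    "bates_start,custodian,doc_date\n"
    "D-0003,Smith,2019-06-08\n"
    "D-0001,Jones,2019-06-09\n"
    "D-0002,Jones,2019-06-08\n"
)
for row in sort_index(index, ["custodian", "doc_date"]):
    print(row["bates_start"], row["custodian"], row["doc_date"])
```

Because the function returns a new list, the original data is left unchanged, which mirrors the advice above to use “Save As” when preserving the original matters.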
Preparing exhibits for trial or court hearings, though not glamorous, is an essential task in the practice of courtroom litigation. Depending on the volume and type of exhibits, this necessary task can quickly turn tedious if you must print each exhibit, affix a physical sticker, fill out the exhibit and case information by hand, then scan and submit the stickered exhibit. In the heat of trial, where last-minute changes take place frequently, it is easy to make mistakes. However, with the right type of technology, such as Adobe Acrobat Pro (or Standard), this process can be done more smoothly, with fewer opportunities for error, and more quickly than the old-school method of stickers and paper. If you have Adobe Acrobat*, we suggest considering digital (electronic) exhibit stickers for your next case.
*Acrobat Standard or Pro, not the free “Reader” version.
This post will walk you through how you can create digital exhibits on your own, including the process of installing a sticker that takes the form of a custom Acrobat stamp. The stamp will allow you to quickly fill in the exhibit and case numbers for your case, and will automatically remember your previous entries the next time you use it.
First, follow the instructions below to install the electronic exhibit sticker.
Installation
Download and copy the exhibit_stickers.pdf file to a location that is easily accessible, such as your Desktop. (NOTE: You can delete this PDF file once we are finished with the installation.)
Open Acrobat and press CTRL-K to open the Preferences menu. Scroll down on the left to “Security (Enhanced)”. Click the “Add File” button, which will open a file explorer window.
Type %appdata% into the address bar and press enter.
This will open a new folder. Open the “Adobe” folder, then the “Acrobat” folder. You may see folders for the different versions that have been installed like a “2017”, “2020” or a “DC” folder. Open the “DC” folder if you have that, or else the highest folder year you have. Open the “Stamps” folder. Find the “exhibit_stamps.pdf” file you saved and drag or copy and paste it into the Stamps folder. Select the file and click “Open.”
This will take you back to the Preferences screen. Verify that exhibit_stamps.pdf is listed inside the box. If the file is there, click “OK”. Then close out of all Acrobat windows.
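The folder-picking rule described above (use “DC” if it exists, otherwise the highest year) can be written out as a short Python sketch. This is only an illustration of where the Stamps folder lives; it does not install anything. The function names are my own, and the folder layout is assumed to match the description above.

```python
import os
from pathlib import Path


def pick_version_folder(names):
    """Choose "DC" if present, otherwise the highest year-named folder."""
    if "DC" in names:
        return "DC"
    years = [name for name in names if name.isdigit()]
    return max(years, key=int) if years else None


def stamps_folder(appdata, version):
    """Build the path to the Acrobat Stamps folder under %appdata%."""
    return Path(appdata) / "Adobe" / "Acrobat" / version / "Stamps"


# On Windows, APPDATA holds the same location that typing %appdata% opens.
appdata = os.environ.get("APPDATA")
if appdata:
    acrobat_root = Path(appdata) / "Adobe" / "Acrobat"
    if acrobat_root.is_dir():
        folders = [p.name for p in acrobat_root.iterdir() if p.is_dir()]
        version = pick_version_folder(folders)
        if version:
            print(stamps_folder(appdata, version))
```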
Usage
Open the PDF that needs an exhibit sticker. Select the “Comment” tool from the list along the right side of the screen.
This will open a new toolbar. Click on the Stamp tool icon, navigate to the “Exhibit Sticker” menu, then click on the Exhibit sticker image.
The first time you use the sticker, it will pop up this window. Check “Don’t show again” and click “Complete.” There is no need to enter any information.
Your cursor will now become a floating exhibit sticker. Click where you would like to place the sticker. Do not worry if the initial placement is not perfect; you can move the sticker to a different part of the page and even resize the sticker after you have placed it.
When you click to place the stamp, a window will pop up asking you to enter an Exhibit Number. Enter the Exhibit number in the box and press OK.
Next, a window will pop up asking you for a Case number. Enter the Case number and press OK.
This will place an exhibit sticker on your PDF that contains the Exhibit Number and Case Number. You can move and resize the sticker if needed. If you need to remove or change any of the information on the sticker, you can right-click on the sticker, select “Delete” and create a new sticker.
To permanently affix the sticker to the document, you will need to print the document to a new PDF. Go to the File menu and select Print. Now change your printer to “Adobe PDF”, change the “Comments & Forms” selection to “Document and Stamps”, then press print and save your new copy to the location of your choosing.
That’s it. You will now have a permanently stamped PDF document. The next time you want to stamp a document, Acrobat will pre-fill your last-entered Exhibit Number and Case Number, so it will be easier to keep track of your exhibits if you are marking multiple documents in one sitting, and you will not have to re-enter the case number each time.
If you need any assistance with installation, you can contact me at carl_adams@fd.org.
Technologies that allow for easier review of eDiscovery in native format have become more affordable and accessible. Working with files in native format has several advantages, including avoiding loss of potentially relevant information, access to metadata, and better searchability. Email is one of the most common of the native formats produced in discovery. This article will explore some approaches for processing email and identify a number of low-cost tools that can assist. (This article deals with the processing but not the substantive review of emails for case analysis. For that, you should consider other tools such as CaseMap or, for larger collections of emails, review platforms such as Casepoint or IPRO.)
The tools and approaches you select will depend on a combination of three factors: (1) volume, (2) format(s) and (3) the defense team goals. While a single tool might facilitate a discrete goal, more involved goals may require different approaches with a combination of tools. These scenarios can be ends in themselves or phases in an overall workflow. This article does not try to anticipate every possible situation that might arise but will explore a few common scenarios.
Many electronic file formats produced in the course of discovery, like Acrobat, Excel and Word files, are generally accessible via standard software available on most computers. However, email file formats like MSG, EML, PST, and MBOX present more of a challenge, as the recipient often may not know how to access these files.
Below is a quick overview of some of the most common email file formats encountered in eDiscovery that will be discussed in this article:
MSG: A Microsoft format for single emails. Often associated with the Microsoft Outlook email client.
PST: A Microsoft format for a collection of emails (as well as other potential items including: Calendars, Contacts, Notes and Tasks). Often associated with the Microsoft Outlook email client.
EML: Email format for single emails used by many email clients including Novell GroupWise, Lotus Notes, Windows Mail, Mozilla Thunderbird, and Postbox.
MBOX: Email format for a collection of emails (as well as other potential items including: Calendars, Contacts, Notes and Tasks) used by many email clients including Novell GroupWise, Lotus Notes, Windows Mail, Mozilla Thunderbird, and Postbox.
All four formats are typically received in discovery and subpoena returns. Google Takeout, a service offered by Google which allows you to download your email, will produce emails in the MBOX format.
Working with these email formats consists of understanding which tool is compatible with which file format, and which tool or set of tools will most effectively allow you to achieve your goals. Below is a table that maps out some of the various tools available in terms of which file formats they are able to process, their functionality and cost. Before using any of these tools, make sure to work with a copy of the data as opposed to the original.
1. Generating a list of emails for review. An initial task at the outset of a case might be to generate an index to facilitate early case assessment. Some programs, like PstViewer Pro, will work with many formats while other programs, like Mbox Viewer, work with a more limited number of formats.
Example 1 – Generating a list using Mbox Viewer: Mbox Viewer is a free tool that allows you to preview emails and generate a list of emails by simply selecting messages in the viewer, doing a right click and selecting print to CSV, then selecting which fields you would like to include in the spreadsheet (Figure 1-1).
Figure 1-1
The resulting CSV file contains a table that can be opened in Excel or imported into other programs (Figure 1-2).
Figure 1-2
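The same kind of index can be sketched with Python's standard library, which can read MBOX files directly. This is not how Mbox Viewer works internally; it is only a minimal illustration of turning an MBOX file into a CSV list. The function name and the choice of fields are assumptions, and, as noted above, it should be pointed at a copy of the data.

```python
import csv
import mailbox


def mbox_to_csv(mbox_path, csv_path, fields=("Date", "From", "To", "Subject")):
    """Write one CSV row per message in an MBOX file and return the message count."""
    # create=False avoids silently creating an empty file if the path is mistyped.
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    try:
        with open(csv_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(fields)
            for message in box:
                # Missing headers become empty cells rather than errors.
                writer.writerow([str(message.get(field, "")) for field in fields])
                count += 1
    finally:
        box.close()
    return count
```

A call such as mbox_to_csv("copy_of_export.mbox", "email_index.csv") (both file names hypothetical) would produce a spreadsheet-ready list that opens in Excel.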
2. Viewing emails. While a list will provide you with a high-level overview of the emails you have in terms of subject matter, players involved and so forth, a closer review will require a different approach. MS Outlook, Mbox Viewer and Mozilla Thunderbird are all tools which can be utilized for this purpose.
Example 2.1 – Viewing emails received in PST format using MS Outlook: Within Outlook open the ‘File’ menu, select the ‘Open & Export’ button, then ‘Open Outlook Data File’. Navigate to the folder containing the PST file (Figure 2-1) and select the file to import. Outlook will create a folder within the ‘Personal folders’ from where you can conduct a review of the files.
Figure 2-1
Example 2.2 – Viewing emails received in MBOX format using Mozilla Thunderbird with the Import Export Tools add-on: The free ‘Import Export Tools’ add-on available for Mozilla Thunderbird allows for the import and viewing of MBOX files. After the add-on has been installed, right click on ‘local folders’, then choose ‘Import mbox file’ from the ‘ImportExportTools NG’ menu and navigate to the folder containing the MBOX file (Figure 2-2). This will copy the MBOX file into Thunderbird’s ‘Local Folders’ where, similar to Outlook, you can conduct a review of the emails within.
Figure 2-2
3. Search, tag, and convert emails. The approaches discussed in the two previous sections can be useful when you simply want to gain a high-level view of the emails, or take a closer look at particular emails in a smaller collection. However, when you are working with large volumes of emails, manual review becomes impractical and inefficient, and taking advantage of the search and tag functionality of the available tools is a better approach.
Example 3 – Searching, tagging and exporting within MS Outlook: Outlook can be utilized to conduct keyword searches, and relevant files can be tagged and exported as either MSG or PDF files (using the Acrobat integration that is included with licensed copies of Acrobat Standard and Pro). To tag an email, right-click and select ‘Categories’, then select a color-coded tag (Figure 3-1). You can also customize the tags using the ‘New Category’ option within the ‘Category’ dialog box (Figure 3-2).
Figure 3-1
Figure 3-2
You can then filter and tag a selection of emails (Figure 3-3) and save them to a folder as either individual MSG files or a new PST file. If you have a licensed version of Adobe Acrobat, the Acrobat integration menu within Outlook can be used to convert messages into individual PDFs or a combined ‘PDF Portfolio’ (Figure 3-4).
Figure 3-3
Figure 3-4
When choosing an export format, be aware of the limitations of the different conversion formats. The HTML and PDF export formats typically will not include the complete email metadata. Email header information that may include important information like IP addresses used may be lost during conversion. Export formats including the MSG, EML, MBOX and PST retain much more of the original email metadata.
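To see what is at stake, the sketch below reads the ‘Received’ header lines from the raw text of a single message (the content of an EML file) and pulls out anything that looks like an IPv4 address. These are the kinds of details that tend to be missing from a PDF or HTML export. The function names and the sample message are illustrative assumptions, and the pattern used is a simple one that does not validate addresses or handle IPv6.

```python
import re
from email import message_from_string
from email.policy import default

# Simple pattern for dotted IPv4-looking values; it does not check numeric ranges.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def received_headers(raw_email: str):
    """Return every Received header line in the message, in order."""
    message = message_from_string(raw_email, policy=default)
    return [str(value) for value in message.get_all("Received", [])]


def ip_addresses_in_headers(raw_email: str):
    """Collect IPv4-looking values found in the Received headers."""
    found = []
    for line in received_headers(raw_email):
        found.extend(IPV4_PATTERN.findall(line))
    return found


# Made-up message using reserved documentation addresses.
sample_email = (
    "Received: from mail.example.com ([192.0.2.10]) by mx.example.org; "
    "Sat, 8 Jun 2019 10:00:00 +0000\n"
    "From: a@example.com\n"
    "Subject: Hi\n"
    "\n"
    "Body\n"
)
print(ip_addresses_in_headers(sample_email))
```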
4. Working with email attachments. Emails often have attachments, which, in addition to the body of the email, can contain substantive relevant information. The programs discussed in this post vary greatly in how attachments are handled during format conversion. Be aware that some of the programs are not able to include the attachments when exporting to PDF. While PDFs are generally easier to Bates stamp or turn into exhibits, not all programs include the attachments.
Example 4.1 – Exporting email with attachments using Mozilla Thunderbird with the Import Export Tools add-on: Thunderbird offers several export options including the ability to batch export relevant emails when using the Import Export Tools add-on. It does not have the ability to embed or append attachments when exporting messages to PDF, however it does allow for emails to be exported to the EML format (with attachments embedded) as well as an HTML format, which will include links to exported copies of the attachments (Figure 4-1).
Figure 4-1
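Because an EML export keeps attachments embedded, it is possible to check which messages carry attachments before deciding how to convert them. The sketch below lists attachment file names from the raw bytes of one message using Python's standard email module. It is independent of Thunderbird and the add-on; the function name and the demonstration message are assumptions for illustration.

```python
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default


def attachment_names(raw_email: bytes):
    """Return the file names of the attachments embedded in one message."""
    message = message_from_bytes(raw_email, policy=default)
    return [part.get_filename() for part in message.iter_attachments()]


# Build a small made-up message with one attachment to demonstrate the idea.
demo = EmailMessage()
demo["Subject"] = "Sample"
demo.set_content("See attached.")
demo.add_attachment(
    b"placeholder bytes",
    maintype="application",
    subtype="pdf",
    filename="report.pdf",
)
print(attachment_names(demo.as_bytes()))
```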
Example 4.2 – Exporting email with attachments using PSTViewer Pro: PSTViewer Pro is yet another option for format conversion, and is a great tool to use in conjunction with tools like Thunderbird or Outlook. It can convert to many formats and includes some advanced PDF conversion options. When converting to PDF, attachments can either be embedded or “imaged” (Figure 4-2). The “imaged” option will convert supported attachments into PDF pages and append them to the PDF version of the email (Figure 4-3).
Figure 4-2
Figure 4-3
Conclusion
As shown in this article there are a multiplicity of tools available to work with emails that are not universally compatible with all email formats and do not have the same functionality. This requires careful thought about how to leverage and integrate the tools. The best path forward through this thicket is to know what your goals are before you select your tool. Defining your goal early will help you select which tool or combination of tools you should use to develop an effective workflow that matches both the set of data you are working with and the needs of your case.
[Editor’s Note: John C. Ellis, Jr. is a National Coordinating Discovery Attorney for the Administrative Office of the U.S. Courts, Defender Services Office. In this capacity, he provides litigation support and e-discovery assistance on complex criminal cases to defense teams around the country. Before entering private practice, Mr. Ellis spent 13 years as a trial attorney and supervisory attorney with Federal Defenders of San Diego, Inc. He also serves as a digital forensic consultant and expert.]
Introduction
This is an updated version of a post originally published in December 2020, which provides a primer on how Google collects location data, the three-step warrant process used by law enforcement to obtain these records, and an example of how the data is collected and used by the prosecution. The updated version includes references to United States v. Chatrie, a recently decided district court opinion regarding the constitutionality of geofence warrants.[i] From the opinion and the pleadings in Chatrie, we have a better understanding of the Google collection and geolocation search warrant process.
What Can Google Do?
Google began collecting location data in order to provide location-based advertisements to its users. Google tracks location data from the users of its products, including consumers who use Android telephones and those who use Google’s vast array of apps on other devices such as Apple iPhones. For Android devices, Google is constantly tracking devices whenever the permission settings on the device are set to allow for the use of Google Location Accuracy. For iOS users, location information is only collected when a user is using a Google product, such as Google Maps.[ii] Google stores this information in a repository called “Sensorvault”, which “assigns each device a unique device ID…and receives and stores all location history data in the Sensorvault to be used in ads marketing.” 3:19-cr-00130-MHL at 7. The use of Sensorvault has been very profitable for Google. Since Google started collecting data and using Sensorvault in 2009, Google’s advertisement revenue has increased almost tenfold.
Google is able to determine the approximate location of a mobile device based on GPS chips in the device, as well as the device’s proximity to Wi-Fi hotspots, Bluetooth beacons, and cell sites.[iii] For purposes of Wi-Fi, Google uses the characteristics of wireless access points within range of the device (including received signal strength) to determine the device’s proximity to the access point, and thus its approximate location. How Google tracks this data is dependent on the type of device (Android v. Apple) and an individual user’s privacy settings.[iv] Google cannot determine the exact location of a device, and as such, location records contain an “uncertainty value” which is expressed in meters.
Maps Display Radius:
Because Google does not know a device’s precise location, it represents the possible location in a sphere, or what Google refers to as the Maps Display Radius.
In this picture, Google’s “goal is that there will be an estimated 68% chance that the user is actually within” the spherical representation.[v]
To see how Google determines the approximate location of a mobile device, viewing the Location History of a Google account is instructive. In the following example, according to Google, the blue line indicates the path of travel, the orange dots represent wireless access points, and the grey sphere next to the blue arrow is the estimated range of the location source.
Generally, the location information source has the largest impact on the Maps Display Radius. Most often, GPS provides the smallest sphere whereas Cell Sites are generally the largest. By way of example, the map display radius for GPS is often a few meters whereas Wi-Fi is routinely over 1000 meters.
Use of Google’s Tools by Law Enforcement – Three-Step Warrant Process
Although the original intent of Google’s Sensorvault technology was to sell advertising more effectively, over the past few years this data has been sought by law enforcement to determine who was present in a specific geographical area at a particular time, for example, when a crime was committed. These warrants are often called “Geofence warrants” because officers seek information about devices contained within a geographic area. In 2021, Google released information about the number of geofence warrants sought by law enforcement. According to the data, “Google received 982 geofence warrants in 2018, 8,396 in 2019 and 11,554 in 2020.”[vi]
In current practice, Google requires law enforcement to obtain a single search warrant. The three-stage warrant process is based on an agreement between Google and the Department of Justice’s Computer Crime and Intellectual Property Section (CCIPS). Once Google receives a geofence warrant, it takes on the extrajudicial role of determining when law enforcement officers have complied with probable cause such that additional information will be provided.
Stage One:
In response to the warrant, “Google must ‘search … all [Location History] data to identify users’ whose devices were present within the geofence during the defined timeframe” and to provide a de-identified list of such users. Chatrie at 19. The list includes: (1) anonymized user identifiers; (2) date and time the device was in the geofence; (3) approximate latitude and longitude of the device; (4) the maps display radius; and (5) the source of the location data.[vii]
Stage Two:
After reviewing the initial list, law enforcement can return to Google and request additional information about any device that is within the first geofence. This includes “compel[ling] Google to provide additional…location coordinates beyond the time and geographic scope of the original request.” Chatrie at 21.[viii] Troublingly, Google imposes “no geographical limits” for Stage Two review. Id.
Stage Three:
The third step involves compelling Google “to provide account-identifying information for the device numbers in the production that the government determines are relevant to the investigation. In response, Google provides account subscriber information such as the email address associated with the account and the name entered by the user on the account.”[ix]
It is important to note that in practice it appears that law enforcement routinely skips Stage Two and moves directly from Stage One to Stage Three analysis.
Past Examples
The shape of Google Geofence warrants has changed over time. For instance, in In the Matter of the Search of information that is stored at premises controlled by Google, 1600 Amphitheatre Parkway, Mountain View, California 94043, law enforcement officers investigating a bank robbery sought information about “all Google accounts” located within a 30-meter radius around 43.110877, -88.337330 on October 13, 2018, from 8:50 a.m. to 9:20 a.m. CST.
Compare that to In the Matter of the Search of Information Regarding Accounts Associated with Certain Location and Date Information, Maintained on Computer Servers Controlled by Google, Inc. In that instance, law enforcement was investigating a series of bombings and sought location information for “all Google accounts” for a 12-hour period between March 1 and 2, 2018 in a “[g]eographical box” around 1112 Haverford Drive, Austin, Texas, 78753 containing the following coordinates: (1) 30.405511, -97.650988; (2) 30.407107, -97.649445; (3) 30.405590, -97.646322; and (4) 30.404329, -97.647983.
More recently, Google has requested that law enforcement submit Geofence warrants that are convex polygons in shape.
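To make the idea of a polygon-shaped geofence concrete, the following is a minimal Python sketch, not a description of how Google or law enforcement actually evaluates a warrant. It uses the four corner coordinates quoted above for the Austin “[g]eographical box,” checks that they form a convex polygon, and tests whether a given point falls inside it. The function names and the test points are hypothetical, and treating latitude and longitude as flat x/y values is an assumption that is only reasonable for an area this small.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude)

# Corner coordinates quoted in the Austin warrant discussed above.
GEOFENCE: List[Point] = [
    (30.405511, -97.650988),
    (30.407107, -97.649445),
    (30.405590, -97.646322),
    (30.404329, -97.647983),
]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Planar cross product of vectors o->a and o->b (x = longitude, y = latitude)."""
    return (a[1] - o[1]) * (b[0] - o[0]) - (a[0] - o[0]) * (b[1] - o[1])


def is_convex(polygon: List[Point]) -> bool:
    """True if every turn along the boundary goes the same direction."""
    n = len(polygon)
    turns = [
        _cross(polygon[i], polygon[(i + 1) % n], polygon[(i + 2) % n])
        for i in range(n)
    ]
    return all(t > 0 for t in turns) or all(t < 0 for t in turns)


def contains(polygon: List[Point], point: Point) -> bool:
    """True if the point lies inside (or on the edge of) a convex polygon."""
    n = len(polygon)
    sides = [_cross(polygon[i], polygon[(i + 1) % n], point) for i in range(n)]
    return all(s >= 0 for s in sides) or all(s <= 0 for s in sides)


# Hypothetical test points: the average of the four corners, and a point to the north.
center = (
    sum(p[0] for p in GEOFENCE) / len(GEOFENCE),
    sum(p[1] for p in GEOFENCE) / len(GEOFENCE),
)
north_of_box = (30.410000, -97.650000)
```

Note that a test like this only says whether the reported coordinates fall inside the box. It says nothing about the uncertainty attached to those coordinates.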
Starting from the Beginning – How the Process Works
To put this into perspective, the following example is illustrative. For these purposes, a crime occurred in the parking lot of a strip mall.
Because the crime occurred in the middle of a parking lot, we will create a geofence that includes storefronts because it will increase the chances that the suspect’s mobile device will be within range of a Wi-Fi hotspot or Bluetooth beacon. Conversely, the geofence will include the mobile devices of numerous people who are not connected to the offense.
The above geofence appears to only impact people who are present in the parking lot or surrounding business. However, the geofence would likely capture many more people, including people living or visiting in the nearby apartments and anyone who was driving on the surrounding streets during the time in question.
Stage One: The following is an example of a Stage One warrant return:

| Device ID | Date | Time | Latitude | Longitude | Source | Maps Display Radius (m) |
|---|---|---|---|---|---|---|
| 123456789 | 12/20/20 | 15:08:45(-8:00) | 32.752667 | -117.2168 | GPS | 5 |
| 987654321 | 12/20/20 | 15:08:55(-8:00) | 32.751569 | -117.216647 | Wi-Fi | 25 |
| 147852369 | 12/20/20 | 15:08:58(-8:00) | 32.752022 | -117.216369 | Cell | 1000 |
| 123456789 | 12/20/20 | 15:09:47(-8:00) | 32.752025 | -117.216369 | Cell | 800 |
| 987654321 | 12/20/20 | 15:09:55(-8:00) | 32.752023 | -117.216379 | Wi-Fi | 15 |
| 123456789 | 12/20/20 | 15:10:03(-8:00) | 32.752067 | -117.216368 | Wi-Fi | 25 |
| 987654321 | 12/20/20 | 15:10:45(-8:00) | 32.752020 | -117.216359 | Cell | 450 |
| 987654321 | 12/20/20 | 15:10:55(-8:00) | 32.752032 | -117.216349 | Wi-Fi | 40 |
| 123456789 | 12/20/20 | 15:10:58(-8:00) | 32.752012 | -117.216379 | Cell | 300 |
Here, Device ID 123456789 is Suspect One, Device ID 987654321 is Suspect Two, and Device ID 147852369 is Suspect Three. For this example, only one location for each device is shown.
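Because a Stage One return lists one row per location record rather than one row per device, a useful first step is to group the rows by Device ID. The following Python sketch shows one way to do that using the device IDs, times, sources, and Maps Display Radius values from the example return. The function name and the summary fields are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# (device_id, time, source, maps_display_radius_m) taken from the example return.
STAGE_ONE_RETURN = [
    ("123456789", "15:08:45", "GPS", 5),
    ("987654321", "15:08:55", "Wi-Fi", 25),
    ("147852369", "15:08:58", "Cell", 1000),
    ("123456789", "15:09:47", "Cell", 800),
    ("987654321", "15:09:55", "Wi-Fi", 15),
    ("123456789", "15:10:03", "Wi-Fi", 25),
    ("987654321", "15:10:45", "Cell", 450),
    ("987654321", "15:10:55", "Wi-Fi", 40),
    ("123456789", "15:10:58", "Cell", 300),
]


def summarize_by_device(records):
    """Group location records by device and note how each device was located."""
    summary = defaultdict(
        lambda: {"observations": 0, "sources": set(), "smallest_radius_m": None}
    )
    for device_id, _time, source, radius_m in records:
        entry = summary[device_id]
        entry["observations"] += 1
        entry["sources"].add(source)
        current = entry["smallest_radius_m"]
        if current is None or radius_m < current:
            entry["smallest_radius_m"] = radius_m
    return dict(summary)


summary = summarize_by_device(STAGE_ONE_RETURN)
for device_id, info in summary.items():
    print(device_id, info["observations"], sorted(info["sources"]), info["smallest_radius_m"])
```

Grouping this way makes it easy to see that nine rows describe only three devices, and that one of them was located solely by a cell site with a 1000-meter radius.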
At first blush, it would appear as if the Geofence has located three possible suspects. But this image does not tell the full story. The blue bubbles for Suspect One and Suspect Two show a Maps Display Radius of 5 and 25 meters respectively.
Suspect Three’s location was derived from a Cell Site, with a Maps Display Radius of 1000 meters.
Thus, although Google believes that Suspect Three’s device was near the scene of the crime, it is possible it was located anywhere within the larger sphere, and it is possible that the device was not located within either sphere.
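The point can be illustrated with a short Python sketch that compares each record’s Maps Display Radius against the geofence. The article does not give the coordinates of the example geofence, so the center point and 100-meter radius below are hypothetical assumptions, as are the function names. The sketch uses the standard haversine formula for distance and labels a record “inside” only if its entire uncertainty circle fits within the fence.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

# Hypothetical circular geofence (assumed for illustration only).
FENCE_CENTER = (32.752000, -117.216400)
FENCE_RADIUS_M = 100


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classify(center, fence_radius_m, point, display_radius_m):
    """Compare a record's uncertainty circle with a circular geofence."""
    distance = haversine_m(center[0], center[1], point[0], point[1])
    if distance + display_radius_m <= fence_radius_m:
        return "inside"      # the whole uncertainty circle fits in the fence
    if distance - display_radius_m > fence_radius_m:
        return "outside"     # the uncertainty circle never touches the fence
    return "straddles"       # the device may or may not have been in the fence


suspect_one = classify(FENCE_CENTER, FENCE_RADIUS_M, (32.752667, -117.2168), 5)
suspect_three = classify(FENCE_CENTER, FENCE_RADIUS_M, (32.752022, -117.216369), 1000)
```

Under these assumptions, Suspect One’s GPS record sits entirely within the fence, while Suspect Three’s cell-site record straddles it. Even an “inside” result reflects only Google’s stated goal of roughly a 68% chance that the device is within the displayed radius.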
Stage Two: For this stage, we can expand our original results, as long as we only include one of the accounts returned in Stage One. Here, we will expand our results and determine whether Suspect One’s device was also present in the area northeast of the original search location.
Stage Three: This is the step in which law enforcement obtains subscriber information about the accounts Google deems responsive. In other words, law enforcement asks Google to provide the account number and subscriber information for the Device IDs produced in either Stage One or Stage Two. The following is an example of such a return:
Conclusion
As technology and the privacy concerns of consumers continue to change, so will the ability of law enforcement to obtain users’ location data. The use of Google geofence warrants implicates a number of Fourth Amendment issues; future posts will explore the legal implications surrounding the overbreadth of these warrants.[x] But beyond the legal challenges, those encountering Google location warrants should remain mindful of the limitations of the data as well as the absence of concrete answers from Google regarding its methodology for determining location data.
[i] See United States v. Chatrie, 3:19-cr-00130-MHL, Docket Entry 220.
[ii] The exception is for a user who has turned location services to always on, has a Google product open on a device, and has allowed for background app refresh. That means that it is likely that Google knows far more about the location history of Android users than iPhone users. That’s important because approximately 52 percent of devices on mobile networks are iOS devices. https://www.statista.com/statistics/266572/market-share-held-by-smartphone-platforms-in-the-united-states/.
[iii] https://policies.google.com/technologies/location-data (“On most Android devices, Google, as the network location provider, provides a location service called Google Location Services (GLS), known in Android 9 and above as Google Location Accuracy. This service aims to provide a more accurate device location and generally improve location accuracy. Most mobile phones are equipped with GPS, which uses signals from satellites to determine a device’s location – however, with Google Location Services, additional information from nearby Wi-Fi, mobile networks, and device sensors can be collected to determine your device’s location. It does this by periodically collecting location data from your device and using it in an anonymous way to improve location accuracy.”)
[v] See United States v. Chatrie, 19cr00130-MHL (EDVA 2020), ECF 1009 [Declaration of Marlo McGriff] (“A value of 100 meters, for example, reflects Google’s estimation that the user is likely located within a 100-meter radius of the saved coordinates based on a goal to generate a location radius that accurately captures roughly 68% of users. In other words, if a user opens Google Maps and looks at the blue dot indicating Google’s estimate of his or her location, Google’s goal is that there will be an estimated 68% chance that the user is actually within the shaded circle surrounding that blue dot.”)
[vii] Id. at 4 (“After that search is completed, LIS assembles the stored LH records responsive to the request without any account-identifying information. This deidentified ‘production version’ of the data includes a device number, the latitude/longitude coordinates and timestamp of the stored LH information, the map’s display radius, and the source of the stored LH information (that is, whether the location was generated via Wi-Fi, GPS, or a cell tower)”).
[Editor’s Note: Tom O’Connor is an attorney, educator, and well-respected e-discovery and legal technology thought leader. A frequent lecturer on the subject of legal technology, Tom has been on the faculty of numerous national CLE providers and has taught college-level courses on legal technology. He has also written three books on legal technology and worked as a consultant or expert on computer forensics and electronic discovery in some of the most challenging, front-page cases in the U.S. Tom is the Director of the Gulf Coast Legal Technology Center in New Orleans, LA.]
If you were practicing in federal court before email and ECF filing, in the days when Joe Montana threw to Jerry Rice, then you probably remember that discovery productions were typically hardcopy documents you picked up at the US Attorney’s Office. The volume was so small it easily fit into your briefcase. Those were the days when everyone complained about not getting enough discovery. The challenge was moving to compel for more discovery when you didn’t know what you didn’t have.
Fast forward to the present. Tom Brady is throwing to Rob Gronkowski (again but in a different city) and discovery is typically so voluminous it cannot be provided in hardcopy form. Productions can be hundreds of gigabytes and sometimes dozens of terabytes full of investigative reports, search warrant pleadings, surveillance audio and video, cell phone data, cell tower material, years of bank records, GPS data, social media materials, and forensic images of servers, desktop computers, and mobile devices. Common are duplicate folders of discovery produced “in the abundance of caution” to protect the Government against Brady violations. Despite the volume, the same issue exists: How do you know what you don’t have?
US v Morgan (Western District of New York, 1:18-CR-00108 EAW, decided Oct 8, 2020) is an example of diligent defense counsel challenging the government on how it produced terabytes of data.
Defendants Robert Morgan, Frank Giacobbe, Todd Morgan, and Michael Tremiti were accused by way of a 114-count Superseding Indictment of running an illegal financial scheme spanning over a decade. The government alleged they defrauded financial institutions and government sponsored enterprises Freddie Mac and Fannie Mae in connection with the financing of multi-family residential apartment properties that they owned or managed. There were also allegations of related insurance fraud schemes against several of the defendants.
The government made several productions which the defense contended were deficient (including the lack of metadata on numerous documents) and, in several cases, omitted key pieces of evidence. The defense enlisted the help of e-Discovery experts, who stated the government failed to properly process and load evidence into their database for production to defense counsel.
The issue was brought before the court in defense motions to compel and dismiss, first before the magistrate judge and then before the district court judge, which resulted in a critical analysis of the way the government handled the discovery.
CASE TIMELINE
The original status conference in the case was held on May 29, 2019. For the next year, a series of motions and hearings followed regarding delays and failures on the part of the government to meet discovery deadlines imposed by the court.
An evidentiary hearing was finally held before district court Judge Elizabeth A. Wolford on July 14, 2020, continuing through the remainder of that week until July 17, 2020, and then resumed and concluded on July 22, 2020. There were multiple expert witnesses, and the review of that testimony is over 7 pages in the Opinion.
On September 10, 2020, oral argument on the motions to compel and dismiss was heard before Judge Wolford. The Court entered its Decision and Order on October 8, 2020.
There was no dispute that the discovery in this matter was not handled properly. In the second paragraph of the above cited Decision and Order, Judge Elizabeth A. Wolford states,
“The Court recognizes at the outset that the government has mishandled discovery in this case—that fact is self-evident and cannot be reasonably disputed. It is not clear whether the government’s missteps are due to insufficient resources dedicated to the case, a lack of experience or expertise, an apathetic approach to the prosecution of this case, or perhaps a combination of all of the above.”
Specifically, the government somehow failed to process and/or produce ESI from several devices seized pursuant to a search warrant executed in May 2018, and one device, a cell phone, appears to have actually been lost. The court ultimately dismissed the case without prejudice. This gave the parties time to resolve the discovery issues. On March 4, 2021, a grand jury returned a new 104-count indictment.
More important for our purposes are the discussions regarding the ESI and production issues. They are outlined below.
PROJECT MANAGEMENT
The Court wasted no time in saying “It is evident that the government has demonstrated a disturbing inability to manage the massive discovery in this case, and despite repeated admonitions from both this Court and the Magistrate Judge, the government’s lackadaisical approach has manifested itself in repeated missed deadlines.”
And later, “To be clear, the Court does not believe the record supports a finding that any party acted in bad faith. Rather, the discovery in this case was significant, and the government failed to effectively manage that discovery. In the end, because of its own negligence, the government did not meet the discovery deadline set by the Magistrate Judge.”
COMPLEXITY OF LARGE AMOUNTS OF ESI
Judge Wolford made several references to the “massive discovery.” In an attempt to manage that data, the Magistrate Judge had initially directed the parties to draw up a document entitled “Data Delivery Standards” (hereinafter referred to as “the DPP”) which would control how documents were exchanged. It failed to do so for several reasons.
First was the large amount of data. Testimony by a defense expert witness at the evidentiary hearing of July 14, 2020, stated that “… the government’s Initial Production consisted of 1,450,837 documents, reflecting 882,841 emails and 567,996 other documents. Of those documents, 860,522 were missing DATE metadata, with over 430,000 documents reflecting no change in the DATE metadata field formatting after the DPP was agreed-upon. Once overlays were provided by the government, the DATE metadata field was corrected for almost one-third of the documents (primarily emails), but 590,448 documents still were missing DATE metadata, including 294,818 emails. Of those 294,818 emails, 169,287 had a misformatted DATE value and 125,531 had no DATE value. The Initial Production also contained missing values for the metadata fields of FILE EXTENSION, MD5 HASH, PATH, CUSTODIAN, MIME TYPE, and FILE SIZE— and the government overlays did not change the status of the information in any of those fields.”
Additionally, the USAO-WDNY’s processing tool was Nuix, while another entity, the Litigation Technology Support Center in Columbia, South Carolina, processed some of the hard drives using a different processing tool called Venio. The Federal Housing Finance Agency (“FHFA”), in turn, processed the Laptop Production using a “much more robust” version of Nuix than the system possessed by the USAO-WDNY.
These differing versions led to different productions which had different values for the metadata fields. Standardization on one tool could have prevented much of this. But the Court also noted that “… the quality review conducted by the government was insufficient to catch these errors.”
Inconsistent directions were an ongoing issue. For example, the Court found that “… the government prosecutors expressly instructed Mr. Bowman not to produce CUSTODIAN information for the Laptop Production, even though the government had provided similar information previously.”
Other government errors included:
It applied different processing software inconsistently to the PST or OST files, thereby missing some metadata and producing varying results.
It misformatted the DATE metadata and failed to catch the errors while conducting a quality review.
It failed to produce native files in “the format in which they are ordinarily used and maintained during the normal course of business[.]” It produced near native or derivative native files from the OST or PST files without corresponding metadata.
In many instances, load files necessary to install the document productions in the defense review software platform were missing.
There were ongoing errors with respect to CUSTODIAN metadata, which were the result of human error on the part of the government.
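Many of the errors listed above are the kind that a simple automated check on a load file can surface before anyone starts reviewing documents. The following Python sketch is one hedged illustration of such a check, not the method used by any party in Morgan. It assumes the load file has been exported as a CSV with a header row, that dates are expected in MM/DD/YYYY form, and that MD5 hashes are 32 hexadecimal characters. The field names come from the testimony quoted earlier, while the function names, the sample rows, and the expected date format are hypothetical.

```python
import csv
import io
import re
from collections import Counter
from datetime import datetime

# Metadata fields the testimony identified as missing or misformatted.
REQUIRED_FIELDS = [
    "DATE", "FILE EXTENSION", "MD5 HASH", "PATH",
    "CUSTODIAN", "MIME TYPE", "FILE SIZE",
]
DATE_FORMAT = "%m/%d/%Y"  # assumption: use whatever the agreed protocol specifies
MD5_PATTERN = re.compile(r"^[0-9a-fA-F]{32}$")


def _valid_date(value):
    """True if the value parses in the expected date format."""
    try:
        datetime.strptime(value, DATE_FORMAT)
        return True
    except ValueError:
        return False


def audit_load_file(handle):
    """Count missing and misformatted metadata values in a CSV load file."""
    counts = Counter()
    for row in csv.DictReader(handle):
        counts["documents"] += 1
        for field in REQUIRED_FIELDS:
            value = (row.get(field) or "").strip()
            if not value:
                counts[field + ": missing"] += 1
            elif field == "DATE" and not _valid_date(value):
                counts["DATE: misformatted"] += 1
            elif field == "MD5 HASH" and not MD5_PATTERN.match(value):
                counts["MD5 HASH: misformatted"] += 1
    return counts


# Hypothetical three-document sample used only to exercise the check.
SAMPLE = "\n".join([
    "DOCID,DATE,FILE EXTENSION,MD5 HASH,PATH,CUSTODIAN,MIME TYPE,FILE SIZE",
    "DOC001,03/14/2017,msg,d41d8cd98f00b204e9800998ecf8427e,/mail/inbox,Custodian A,application/vnd.ms-outlook,20480",
    "DOC002,2017-03-15T09:30:00,msg,d41d8cd98f00b204e9800998ecf8427e,/mail/inbox,,application/vnd.ms-outlook,1024",
    "DOC003,,pdf,not-a-hash,,Custodian B,application/pdf,",
])

report = audit_load_file(io.StringIO(SAMPLE))
for issue, count in sorted(report.items()):
    print(issue, count)
```

Running a check like this on each production as it arrives, and on any overlay, gives defense counsel concrete numbers to bring to the government and, if need be, to the court.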
WHAT DOES THIS MEAN TO YOU?
As for the specific steps that can be used to take control of cases with large amounts of ESI, the Court mentioned several.
Use an exchange protocol. In civil cases, this document would arise from FRCP Rule 26(f), which mandates a “Meet & Confer” conference of the parties so that they might plan for discovery through the presentation of a specific plan to the Court.
In Morgan, this was the document entitled the DPP. In criminal cases going forward, the new Federal Rule of Criminal Procedure 16.1 will address some of these concerns. Drawn up specifically to deal with the manner and timing of the production of voluminous Electronically Stored Information (ESI) in complex cases, Subsection (a) requires the prosecution and defense counsel to confer “[n]o later than 14 days after the arraignment…to try to agree on a timetable and procedures for pretrial disclosure under Rule 16.” Subsection (b) authorizes the parties, separately or together, to “ask the court to determine or modify the time, place, manner or other aspects of disclosure to facilitate preparation for trial.”
Standardize the use of technology. As Judge Wolford said, “In sum, the Court believes that it would have been much more prudent if the government, after reaching agreement with the defense about the DPP, had utilized a competent vendor to process the ESI (and all the previously produced ESI) in the same manner with the same settings and utilizing the same tools.”
Get a data manager. A common saying in IT circles is that “someone needs to own the data.” In this case, where the Government used multiple parties who employed different tools to work with the data, nobody owned the data. This lack of a central manager “… led to electronic productions being produced in an inconsistent manner and, in some instances, in violation of the DPP.”
Get an expert. After hearing multiple experts testify for several days on what had transpired with the ESI, the Court noted, “… electronic discovery is a complicated and very technical subject. As a result, facts can be easily spun in a light most favorable to one party’s position or the other. That occurred here on behalf of all parties.”
Nonetheless, the experts were able to bring clarification to the issues of “missing” metadata and divergent processing results that had beleaguered the parties and the Court. This field, especially with large amounts of ESI, is a classic example of the old maxim, “do not try this at home.” Get an expert.
Use a review tool. ESI in these large amounts simply cannot be reviewed manually. Both parties here recognized that fact and, as the Court noted several times, most of the errors in the case were not due to software but to what we nerds call the “loose nut on the keyboard” syndrome.
Get review software. Get trained on it. Use it. One admonition I always make, which could have avoided many delays in this matter, is this: do not try to load everything at once into your review platform. Start with a small amount of sample data to be sure you are getting what you need. Which leads to our last takeaway.
Talk with the government. Judge Wolford specifically noted that “… the Court also concludes that Defendants and the government were not always communicating effectively regarding electronic discovery.” For example, none of the parties could recall “… any discussions during those negotiations about the processing tools that would be utilized or the type of native file that would be analyzed for purposes of creating a load file.”
CONCLUSION
The Morgan case illustrates that there are ways to learn about what you don’t have so you can bring it to the government’s attention and, if need be, to the Court. It is also an example of a Court being knowledgeable about ESI productions. The Court noted often and in different ways that “… electronic discovery is challenging even under the best of circumstances. In other words, the facts and circumstances cannot be appropriately evaluated without considering the volume of discovery and the enormous efforts needed to manage an electronic production of this nature.”
Find an expert who understands your needs and has the communication skills to convey complex technical issues to you, the government, and the Court. For many years, Magistrate Judge Andrew Peck (SDNY, Retired) advocated “Bring-Your-Geek-To-Court Day,” in which parties bring an outside consultant or an in-house IT person to address disputes. If you were to remember only one thing from this case, it should be: Go get a geek.
[Editor’s Note: John C. Ellis, Jr. is a National Coordinating Discovery Attorney for the Administrative Office of the U.S. Courts, Defender Services Office. In this capacity, he provides litigation support and e-discovery assistance on complex criminal cases to defense teams around the country. Before entering private practice, Mr. Ellis spent 13 years as a trial attorney and supervisory attorney with Federal Defenders of San Diego, Inc. He also serves as a digital forensic consultant and expert.]
For many years, law enforcement officers have used records generated by mobile carriers to place a mobile device in a general area. The records are called Call Detail Records (“CDRs”). CDRs are generated when a mobile device sends or receives calls and text messages. Mobile carriers likewise keep records of when data is used, such as when browsing the internet. These records are called Usage Detail Records (“UDRs”). At times, the records generated by mobile carriers include the location of the cell site or cell sites and the direction of the antenna that connected with the mobile device.
Cell Site Location Information (“CSLI”) is the practice of creating maps showing the possible coverage area of a cell site at the time a device was being used. For these purposes, it is important to keep in mind that the records only show the location of the cell site and the direction the antenna is facing. Recent technological improvements have resulted in mobile carriers now generating Enhanced Location Records (“ELRs”), which purport to show more precise location data. In AT&T parlance, such records are based on the Network Event Location System (“NELOS”). This location data is derived from proprietary algorithms.
In a recent federal case, the government, through a member of the Federal Bureau of Investigation’s (“FBI”) Cellular Analysis Survey Team (“CAST”), sought to introduce NELOS records in a trial. However, after a Daubert hearing where the CAST agent testified, the district court excluded the records, in part, because of concerns over the reliability of the algorithms used to determine the location data.
This article provides an overview of CSLI and NELOS records, discusses the order excluding NELOS records from trial, and provides practical advice for practitioners.
Overview:
When CDRs include cell site location data, analysts and law enforcement officers use these records to show the location of the cell site and the orientation of the sector. In North America, many cell towers contain three sets of antennas, with each set offering specific coverage area.
Picture 1
To illustrate this point, Picture 1 is an overview picture of a multi-directional cell tower. Each blue arm is a sector. When a mobile device connects to a cell site, the mobile carrier often records the activity (i.e., a sent text message), the time of the activity, and the location of the cell site and sector that was used.
Using these three data points, analysts and law enforcement officers create maps showing the location of the cell site and the orientation of the sector. In Map 1, the arms are used to demonstrate the beamwidth of the sector, which in this case records indicate is 120 degrees. The cone at the base of the triangle is only meant to show the orientation of the sector, not coverage area. Moreover, analysts generally will not testify that the mobile device was within the triangle. The triangle is only meant to represent the location of the cell site and the orientation of the sector.
Map 1
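For readers who want to see how a sector wedge like the one in Map 1 is derived from the records, the following Python sketch computes the two compass bearings that bound a sector from its azimuth (the direction the antenna faces) and its beamwidth. The azimuth values and function names are hypothetical. The 120-degree beamwidth matches the example above, but actual values come from the carrier’s records.

```python
def sector_bounds(azimuth_deg, beamwidth_deg=120.0):
    """Return the (start, end) compass bearings of a sector wedge, clockwise."""
    half = beamwidth_deg / 2.0
    return ((azimuth_deg - half) % 360.0, (azimuth_deg + half) % 360.0)


def bearing_in_sector(bearing_deg, azimuth_deg, beamwidth_deg=120.0):
    """True if a compass bearing from the cell site falls within the drawn wedge."""
    # Smallest signed angular difference between the bearing and the azimuth.
    difference = (bearing_deg - azimuth_deg + 180.0) % 360.0 - 180.0
    return abs(difference) <= beamwidth_deg / 2.0


# Hypothetical three-sector tower with antennas facing 0, 120, and 240 degrees.
for azimuth in (0.0, 120.0, 240.0):
    print(azimuth, sector_bounds(azimuth))
```

Consistent with the caution above, a bearing that falls inside the wedge shows only that a direction lines up with the sector’s orientation. It does not establish that the device was located within the drawn triangle, and it says nothing about how far the sector’s coverage actually reaches.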
With NELOS records, on the other hand, the ELRs purport to show the location of a device as opposed to the location of the cell site. In the following example, the red pin represents the location of the device. The blue circle represents what AT&T calls the “Location Accuracy.” This accuracy ranges from approximately several meters to 10,000 meters. And some records are marked by “location accuracy unknown.” As discussed below, the Location Accuracy is determined by proprietary algorithms used by AT&T.
Map 2
In Map 2, the ELR indicates that the “[l]ocation accuracy [is] likely better than 300 meters.” In other words, the phone was at the red pin or within the blue circle at a specific date and time. NELOS records, however, contain the following statement: “The results provided are AT&T’s best estimate of the location of the target phone. Please exercise caution in using these records for investigative purposes, as location data is sourced from various databases, which may cause the location results to be less than exact.” DE 156 at 23 (emphasis added).
To put the first two examples into perspective, Map 3 shows both traditional CSLI and the use of NELOS records.
Map 3
The NELOS demonstrative, even taking account of the “Location Accuracy,” still provides a much smaller, and thus more specific, area of where the phone activity took place.
United States v. Smith, et al. (4:19-CR-514-DPM) (EDAR):
Donald Smith and Samuel Sherman were charged in a five-count indictment with various crimes relating to a murder. See Docket Entry (“DE”) 1. The government sought to introduce the testimony of CAST Agent Mark Sedwick “that provider-based location data typically is collected by obtaining historical call detail records for a particular cellular telephone from the service provider, along with a listing of the cell tower locations for that service provider.” DE 102 at 1. According to the government, “[t]his data is then analyzed for the purpose of generally placing a cellular telephone at or near an approximate location or locations on a map at points in time.” Id.
The government sought to have Agent Sedwick testify “regarding the activity and approximate locations of the cellular telephones believed to have been utilized by Donald Bill Smith, Samuel Sherman, Racheal Cooper and Susan Cooper on the approximate dates and times relevant to the charges in the Indictment.” Id. at 1-2. Attached to the government’s motion is the report created by Agent Sedwick. Maps 4 and 5 are examples from Agent Sedwick’s report. Map 4 shows how Agent Sedwick mapped traditional CSLI, and Map 5 shows how he mapped the same time period using NELOS records:
Map 4
Map 5
Map 4 shows traditional CSLI mapping with the location of the cell site and the orientation of the sector. With Map 5, each circle represents the area in which the device was used. Here, there are four such events. For comparison, in Map 4, Agent Sedwick’s opinion is limited to testifying about the location of the cell site and the orientation of the sector, whereas with Map 5, the testimony is that the mobile device is within the circle.
Prior to trial, defense counsel challenged Agent Sedwick’s potential testimony, and the district court conducted a hearing to determine the admissibility of the records pursuant to Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993). During the hearing, Agent Sedwick explained the reason AT&T created NELOS was to “test the health of the 3G network for planning and troubleshooting. It is a passive system where, while the phone is on the control channel communicating with the network across the control channel, it would passively pull whatever location data it could pull or data to compute location from that device.” DE 156 at 8.
Agent Sedwick further explained: “NELOS also became the generic term for any kind of location data. So depending, there might be other databases that were also pulled into the NELOS report that we receive from AT&T. Just from that report there’s no way to determine what other databases that was pulled from.” DE 156 at 9.
Agent Sedwick also provided information about known issues with NELOS data, specifically based on Temporary Mobile Subscriber Identity (“TMSI”). By way of background, mobile devices are assigned an International Mobile Subscriber Identity (“IMSI”), a unique number used by mobile carriers, which establishes that the mobile device can operate on a specific network. This is the number used by mobile carriers when creating CDRs. At times, however, in order to mask a device’s actual IMSI, networks assign the device a TMSI.[1] This is problematic for NELOS records because as Agent Sedwick explained, “[t]hat TMSI sometimes can get reallocated and then allocated back to a device, so you can have sometimes where the NELOS data will pull from a different device and get put into the records for the device that you’re requesting.” DE 156 at 10.
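The reallocation problem Agent Sedwick describes can be illustrated with a toy model. The sketch below is not AT&T's system or the NELOS algorithm, which the testimony says is proprietary; every name, identifier, and time value in it is hypothetical. It only shows why records keyed to a TMSI, without regard to when that TMSI was held by which device, can sweep in another device's location events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    """Hypothetical record of a TMSI being held by one IMSI for a time window."""
    tmsi: str
    imsi: str
    start: int  # inclusive, arbitrary time units
    end: int    # exclusive


@dataclass(frozen=True)
class LocationEvent:
    """Hypothetical location event logged against a TMSI at a point in time."""
    tmsi: str
    time: int
    label: str


def events_by_tmsi_only(events, assignments, target_imsi):
    """Naive attribution: keep any event whose TMSI was EVER held by the target."""
    target_tmsis = {a.tmsi for a in assignments if a.imsi == target_imsi}
    return [e for e in events if e.tmsi in target_tmsis]


def events_by_tmsi_and_time(events, assignments, target_imsi):
    """Time-aware attribution: the target must hold the TMSI at the event time."""
    windows = [a for a in assignments if a.imsi == target_imsi]
    return [
        e for e in events
        if any(a.tmsi == e.tmsi and a.start <= e.time < a.end for a in windows)
    ]


# Assumed scenario: TMSI "T1" goes to device A, is reallocated to device B,
# and is then allocated back to device A.
ASSIGNMENTS = [
    Assignment("T1", "IMSI-A", 0, 10),
    Assignment("T1", "IMSI-B", 10, 20),
    Assignment("T1", "IMSI-A", 20, 30),
]
EVENTS = [
    LocationEvent("T1", 5, "device A, first window"),
    LocationEvent("T1", 15, "device B, reallocated window"),
    LocationEvent("T1", 25, "device A, second window"),
]

naive = events_by_tmsi_only(EVENTS, ASSIGNMENTS, "IMSI-A")
careful = events_by_tmsi_and_time(EVENTS, ASSIGNMENTS, "IMSI-A")
print(len(naive), "events attributed naively;", len(careful), "with time windows")
```

In this assumed scenario the naive lookup attributes device B's event to device A. Whether and how the actual algorithm guards against that is exactly what the hearing record left unanswered.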
During cross-examination, Agent Sedwick was questioned about the portion of NELOS records that “caution in using these records for investigative purposes.” Agent Sedwick responded: “I wouldn’t rely on it if all I had was a NELOS point putting someone at a scene and that’s all I had, no, I would not use it. I’m using it—there is a caution with it, but I’m using it in the context of I have call and text to support it, I have other data to support, I have very good precise NELOS data. I feel very, very confident that this is accurate.” DE 156 at 24.
Agent Sedwick’s confidence in the accuracy of NELOS records was based on the proprietary algorithms created by the phone company. See DE 156 at 12 (“Question: Okay. So the device is sending various different events, they’re plugged into that algorithm, and essentially the algorithm will spit out what it computes as accuracy; is that correct? Answer: Yes, ma’am”). But Agent Sedwick acknowledged that he was not privy to the algorithm, nor did he know whether NELOS was tested by AT&T for reliability. Instead, Agent Sedwick testified he believed the algorithms are reliable “[b]ecause AT&T relies on that to make multi-million-dollar decisions on how they’re going to design their network.” DE 156 at 32.
In granting the defense’s motion to exclude NELOS data, the district court found:
What particularly concerns me, though, is this mystery algorithm that our—and the proprietary software. We don’t know, I don’t know exactly what is in the algorithm, and the agent gave some testimony at a general level about the kind of information that goes in, but it seems to me that I’m missing a—an important foundational stone there of something with more specificity as to the kinds of things that the algorithm uses and how the algorithm does its work.
We know that there are disturbances from time to time, or anomalies as was called, with the TMSI number. I also—I acknowledge some uncertainty about TMSI numbers and how many devices that might be connected with and how it is that the algorithm might deal with that. So there’s that. Then there is, in my view, almost a—so we’ve got our black box there, which is concerning, and I would say at this point there’s a peer review problem, as well, because I don’t have any scholarly literature or evaluation of the black boxes or the kind of things that could go into this black box and how it would work.
…
I understand about the corroboration, but I still find myself at sea of understanding how it is the—how things happen in the black box and whether—whether what comes out of the black box is sufficiently reliable that the jury can rely on it.
DE 156 at 85-87 (emphasis added).
Based on this, the district court entered the following order: “Agent Sedwick may testify about call detail records and historical cell-site analysis; but he may not testify about NELOS data and analysis.” DE 154.
Further Consideration:
The district court’s exclusion of NELOS records was based, in part, on the use of data generated by untested algorithms. Other mobile carriers also use ELRs, which generate purported location data that are also based on proprietary algorithms similar to NELOS. In seeking to exclude ELRs, as well as other forms of computer-generated data, counsel should encourage courts to question the reliability of evidence created by algorithms that lack independent validation and verification.
Glossary:
Acronym: Full Title
CAST: Cellular Analysis Survey Team
CDR: Call Detail Records
CSLI: Cell Site Location Information
ELR: Enhanced Location Records
IMSI: International Mobile Subscriber Identity
NELOS: Network Event Location System
TMSI: Temporary Mobile Subscriber Identity
UDR: Usage Detail Records
[1] As explained by EFF, “upon first connecting to a network, the network will ask for your IMSI to identify you, and then will assign you a TMSI … to use while on their network. The purpose of the pseudonymous TMSI is to try and make it difficult for anyone eavesdropping on the network to associate data sent over the network with your phone.” See https://www.eff.org/wp/gotta-catch-em-all-understanding-how-imsi-catchers-exploit-cell-networks.
Whether a federal criminal defense attorney is a sole practitioner, part of a firm or in a Federal Public or Community Defender Office, they are often assigned to a case on their own. In many situations, that is manageable because there is not a lot of information to organize, the client can help to review the discovery produced by the government or the strategy involves a plea. However, as cases continue to grow in size and complexity, it’s helpful to have paralegal assistance. A paralegal can support attorneys in many ways in a case, ranging from assisting with client contact to aiding attorneys at hearings and trial, but it is with discovery management that paralegals are increasingly important in today’s legal world. They can help the defense team get the work done faster and make the overall process more cost effective. A paralegal can contribute when an attorney is trying to understand the scope of the discovery and design a strategy to access and review the files more efficiently, organize everything, and ultimately search and review the discovery and case materials in a meaningful way.
Unique challenges that federal criminal defense practitioners face include increasing numbers of proprietary formats that standard software cannot open, large volumes of information that need to be sifted through and the potential lack of technology resources. All of these challenges make having human resources available even more important.
Fortunately, even sole practitioners need not fly solo. They can have paralegals as permanent members of their team or hire them specifically for a case.
While some paralegals have experience working on particular types of cases and are proficient in using certain software tools, some are new to the field and eager to learn. The type of paralegal that is the best fit for a criminal defense practice or for a case depends on the attorney’s working style, the type and complexity of the discovery involved, the timeline of the case, and the long-term goals to be met by adding a member to the team. Below are some questions an attorney should consider when thinking about hiring a paralegal.
Do I need them to understand how to manage a case as soon as they walk in the door?
What litigation support software am I using that I want them to be familiar with and have experience using?
Do I want someone who knows about programs that can help me better manage discovery (and perhaps knows more than I do on the topic)?
Do I need someone experienced with using online document review databases?
Do I need someone who understands how to search large sets of discovery using metadata filters (e.g., date ranges, file types, authors, and recipients) combined with keywords to help me identify the most relevant documents in the discovery?
Do I need someone experienced in creating complex Boolean searches for culling large data sets into more manageable sets of discovery to review?
Have they previously worked in my district on federal cases?
If not, are they willing to learn about the types of cases and the types of discovery generated here and become familiar with the unique nature of my district?
Have they worked on the types of cases to which I am typically appointed?
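To make the search-related questions above concrete, here is a minimal sketch of what combining metadata filters with a Boolean keyword search looks like. It is an illustration only, not the query syntax of any particular review platform. The `Doc` fields, the sample documents, and the `matches` helper are all hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Doc:
    """Hypothetical discovery item with a few common metadata fields."""
    doc_id: str
    sent: date
    file_type: str
    author: str
    text: str


def matches(doc, *, start, end, file_types, all_of=(), any_of=(), none_of=()):
    """Return True if doc passes the metadata filters AND the Boolean keyword test.

    all_of  -> every term must appear (AND)
    any_of  -> at least one term must appear (OR); ignored when empty
    none_of -> no term may appear (NOT)
    """
    if not (start <= doc.sent <= end):
        return False
    if doc.file_type not in file_types:
        return False
    words = set(re.findall(r"[a-z0-9']+", doc.text.lower()))
    if not all(term in words for term in all_of):
        return False
    if any_of and not any(term in words for term in any_of):
        return False
    return not any(term in words for term in none_of)


# Hypothetical sample set.
DOCS = [
    Doc("D1", date(2019, 6, 8), "email", "a.smith", "Wire transfer confirmed for the invoice."),
    Doc("D2", date(2019, 6, 10), "email", "b.jones", "Lunch on Friday? No invoice talk, please."),
    Doc("D3", date(2018, 1, 2), "email", "a.smith", "Wire transfer scheduled."),
    Doc("D4", date(2019, 6, 9), "pdf", "a.smith", "Invoice and wire details; draft only."),
]

# Equivalent to: June 2019, emails or PDFs, wire AND (invoice OR transfer) NOT draft
hits = [
    d.doc_id for d in DOCS
    if matches(
        d,
        start=date(2019, 6, 1),
        end=date(2019, 6, 30),
        file_types={"email", "pdf"},
        all_of=("wire",),
        any_of=("invoice", "transfer"),
        none_of=("draft",),
    )
]
print(hits)
```

The point of the sketch is the culling effect: each added filter or Boolean term narrows the set, which is the skill the questions above are probing for.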
If you are considering hiring a paralegal for a particular case, it is crucial that they have a suitable skillset for it. They should know how to leverage outside resources to make the overall discovery review process more efficient. For example, they may recommend using a third-party vendor to process and host e-Discovery. This is reasonable and oftentimes preferred, but they should not be billing time to have others do the work you expect from them.
The National Litigation Support Team (NLST) is a resource not only for CJA panel attorneys, but for private paralegals who assist panel attorneys as well. The NLST can answer questions and provide strategies about best practices for managing particular formats of discovery, demonstrate how a third-party vendor can assist in particular situations, introduce your paralegal to software available to panel members, and provide one-on-one training on those tools, so that they can provide you with the best possible support.
Adding a paralegal to your practice, or to your team for a single case, can be the difference between discovery being left unreviewed due to shortage of time or lack of technology and being able to focus on telling your client’s story. Paralegals should be vetted, and you should have a clear understanding of their familiarity and experience with technology, the types of cases they have worked on and their willingness to learn new platforms and new ways of searching, reviewing, and managing information. When you find the paralegal that is a good fit for your practice, they will truly become the linchpin to your team’s discovery review process.
The National Litigation Support Team (NLST) is pleased to announce that IPRO has agreed to provide a discounted rate for CJA panel attorneys to purchase a subscription license of TrialDirector 360.
TrialDirector 360 is a courtroom presentation tool that allows users to present documents, pictures and videos in hearings and trials. Users can prepare exhibits in advance, or instantly display exhibits to jurors and judges. Additionally, attorneys can direct jurors’ attention to the most important parts of exhibits by doing call-outs, zoom-ins, mark-ups, highlights, and side-by-side comparisons of documents. During the examination of a witness, it is easy to do a screen capture of information that has been displayed to the jury for later use in the trial, and the software works well when used along with PowerPoint. TrialDirector has been successfully used for many years by FDOs and CJA panel attorneys representing clients and has been a staple of the Law and Technology workshop training series for close to 20 years.
CJA panel attorneys can purchase TrialDirector 360 at a discounted price of $556.50 per year (approximately 40% off the retail price). This price is for a subscription, so you must pay this amount each year to continue using the software.
CJA panel attorneys interested in purchasing TrialDirector 360 should contact Kelly Scribner. If you have any questions regarding the use of TrialDirector 360 in your office, please contact the National Litigation Support Team (NLST): Kelly Scribner or Alex Roberts.
The NLST will be providing remote one-on-one training on how to use TrialDirector 360 for any interested user. Please have the user contact Kelly Scribner to schedule training.