dtSearch Desktop Demonstration Video

dtSearch is a popular search and retrieval program. Here is a brief 12-minute video that demonstrates how to set up a new dtSearch index and how to run searches within an index.

As mentioned in the dtSearch Desktop post, we have been able to obtain a limited number of licenses that will be made available to CJA panel attorneys with current, active cases. To request a license, go to the dtSearch Desktop post and fill out the request form at the bottom.

Note: like most litigation software programs, this program was developed for Windows-based operating systems and does not work with Macintosh operating systems.

 

dtSearch Desktop

Limited licenses of dtSearch Desktop Available for CJA Panel Attorneys

We are pleased to announce that we have been able to obtain a limited number of dtSearch Desktop software licenses for CJA panel attorneys with current, active cases.

dtSearch is a popular search and retrieval program, and it is the search engine used in well-known computer programs such as Forensic Toolkit (FTK, a computer forensic tool), CaseMap and Adobe Acrobat Pro. This type of program is a useful tool to assist legal teams in searching discovery, creating brief banks, and viewing different file types (including non-PDF files) even if you don’t have the associated application. We have a limited number of licenses available for CJA panel attorneys to use for free (a $200 value).

The program is well suited to searching both electronic documents and paper documents that have been scanned and converted to a text-searchable format, especially since it can search and retrieve information in many different file types. dtSearch is a user-friendly software program that provides immediate results and utility for even the novice computer user. As electronic discovery in federal criminal matters continues to grow in volume and in the variety of formats, dtSearch is a useful tool for CJA panel attorneys faced with the daunting task of organizing and searching through their case material.

To obtain the software, please fill out the dtSearch Request Form below. When you have finished filling out the form, press the “submit” button at the bottom of the form. This will attach your completed form to an email message sent to Assistant National Litigation Support Administrator Kelly Scribner ( kelly_scribner@fd.org ). You will then receive an email with download instructions and the activation code necessary to obtain your free copy of dtSearch Desktop. Please allow up to 5 business days to process your request. Each user license can be installed for that user on two machines.

You must have an active appointed case to continue to utilize the license.  If you are no longer on the panel and don’t have an active appointed case, we request you return the license to the National Litigation Support Team (NLST) by contacting Kelly Scribner so the license can be used by other CJA panel attorneys.  Like most litigation software programs, this program was developed for Windows-based operating systems and does not work with Macintosh operating systems.

For technical support or if you have any questions regarding the utilization of dtSearch within your office, please contact either Alex Roberts or Kelly Scribner (members of the NLST) at 510-637-3500, or by email: alex_roberts@fd.org, kelly_scribner@fd.org.  If you want to learn more about dtSearch, go to http://dtsearch.com/.

dtSearch Desktop Request Form:

Adobe Acrobat: “Renderable Text”

When working with PDF documents, you may encounter a “renderable text” error message. This message will sometimes occur when trying to make a scanned paper PDF file text searchable (also known as adding OCR to a document).

The exact wording of the message depends on the version of Acrobat you have.

“Renderable text” is typically text that has been added to a scanned paper image (like a header, footer or Bates number) through a non-Acrobat program. The way this text is encoded into the page can cause Acrobat to disallow additional searchable text (OCR text).

This message can certainly be annoying and it can also be significant as it can limit your ability to run searches.  In Acrobat, you will be unable to add new searchable OCR text, or improve the quality of the existing OCR, until the error is fixed.

If you’ve seen this message before, and have tried to fix the document without success, you are not alone! We have spoken with a number of people over the years who have come up with some creative solutions. Though we have yet to find “one solution” that will always fix this particular error, here are a number of possible solutions (results will vary depending on the cause of the error):

Solution 1: Obtain a version of the document with OCR.

  • It may seem simplistic, but if you receive documents without searchable OCR, ask for it.  Often the person or organization that gave it to you will want to search the files themselves and may already have a copy that has been OCR’ed.  Even if the documents they give you generate “renderable text” error messages, you will still be able to search any of the existing OCR text within the files.

Solution 2: If the files are from PACER / ECF, download a new copy.

  • The default download settings in PACER / ECF will add “purple” headers with the case number (which will cause a “renderable text” error message).  If you can find the document again in PACER / ECF, download it with the header option turned off.

Solution 3: Run “Add Tags to Document” (available in Acrobat Pro).

  • If you have Acrobat Pro installed, there is a special “Accessibility” menu where you can run “Add Tags to Document”. For certain PDFs, running this option will clear up the issue and allow OCR to be run on the document.

Solution 4: Print the document to PDF (available in Acrobat Standard and Acrobat Pro).

  • If you have Acrobat installed (Standard or Pro), you’ll probably also have access to an “Adobe PDF” virtual printer. Printing the document to this virtual printer creates a new PDF that will often avoid the renderable text issue.

Solution 5: “Sanitize” the document then rerun OCR (available in Acrobat Pro).

  • From the “Protection” menu run “Sanitize Document”.  This will remove all of the document metadata including some of the rendered text that might be causing the error.
  • Re-run the OCR process.

Solution 6: Convert to TIFF files and back, and then re-run OCR (available in Acrobat Standard and Acrobat Pro).

  • Open the PDF document in Acrobat and choose “File > Save As”.
  • In the “Save As” dialog box, choose TIFF (*.tif, *.tiff) from the Save As Type (Windows) or Format (Mac OS) pop-up menu. Specify a location, and then click Save.  Acrobat saves each page of the PDF document as a separate, sequentially numbered TIFF file.
  • Combine the single pages back into a multipage document and re-run the OCR process.

Solution 7: Convert to XPS file format and back, and then re-run OCR.

  • If your computer has the “XPS” virtual printer installed (it comes with many versions of MS Office), then print the file using the “Microsoft XPS Document Writer” printer.
    • The XPS printer will ask you to save the file.
    • Convert the saved XPS file to PDF.
    • Re-run the OCR process on the new PDF.

Solution 8: Try running the OCR using a different program.

Adobe Acrobat Training Videos: Searching Fundamentals


Adobe Acrobat Pro is one of the most popular computer software programs on the market for FDO and CJA panel attorneys.  Since so much of the discovery we currently receive in criminal cases is provided in paper or scanned paper format, Acrobat Pro is an excellent tool to help you to better organize and review it.

In our team’s continued effort to provide resources to CJA panel attorneys and FDO staff, we are creating a series of training videos. Each short video will address a specific feature in a computer software program, with our first set focused on Adobe Acrobat Pro XI.

Future videos we are developing will also be posted on this blog.  Make sure to check back in or sign up to subscribe to our blog to get notices of new posts by email.

These videos do not take the place of hands-on training sessions, where we can get in depth about a variety of software programs and legal strategies for addressing complex cases, but they will hopefully provide you with some basic background information that can help you in your cases.

DeMystifying De-NIST

With ever-rising volumes of discovery data, legal teams are increasingly looking for solutions that can help them manage the amount of data they need to review. In circumstances where significant amounts of ESI (Electronically Stored Information) and forensic images of hard drives are involved, one common method is to “De-NIST” discovery data sets. “De-NIST”ing can be a significant time and money saver and an important part of the discovery review process.

So what the heck does “De-NIST” mean?  

NIST is the acronym for the National Institute of Standards and Technology (website www.nsrl.nist.gov).  One of NIST’s projects is the National Software Reference Library.  This project is designed to identify and collect software from various sources and create a Reference Data Set (RDS).  The RDS is a collection of digital signatures of known, traceable software applications. 

A digital signature is like a digital fingerprint (it is also commonly referred to as a hash value). In theory, every file with distinct content has a unique hash value, so two files with the same hash value are considered duplicates.
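To make the idea of a hash value concrete, here is a minimal sketch in Python using the standard `hashlib` module. The function name `file_hash`, the sample contents, and the choice of SHA-1 are our own assumptions for illustration; the point is only that identical content yields an identical hash value, while any change yields a different one.

```python
import hashlib


def file_hash(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hexadecimal hash value (digital fingerprint) of file content."""
    return hashlib.new(algorithm, data).hexdigest()


# Hypothetical file contents, used only for illustration.
original = b"Standard install file contents"
copy = b"Standard install file contents"      # exact duplicate
edited = b"Standard install file contents!"   # one character added

print(file_hash(original) == file_hash(copy))    # True: duplicates share a hash
print(file_hash(original) == file_hash(edited))  # False: any change alters it
```

For a real file you would read its bytes from disk first (for example with `open(path, "rb").read()`) and pass them to the same function.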

Most software applications consist of multiple files. For example, when Adobe Acrobat Reader is installed, hundreds of standard files are copied to a computer’s hard drive. All of these standard install files are the same (i.e., they have identical hash values) no matter what computer they reside on. A typical computer contains hundreds of software applications. The files associated with running these applications are not user generated and hold little evidentiary value for litigation purposes. The NIST list is a database that contains over 28 million of these file signatures.

“De-NIST”ing is the process of identifying these files so that a decision can be made whether they should be set aside or removed from a discovery database. The NIST list is compared to the file signatures of the data sets within the discovery. Any file that has a signature that matches one in the NIST list can be “De-NIST”ed (identified or removed) from the collection.
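The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `known_hashes` set is a tiny hypothetical stand-in for the NIST list, the file names and contents are invented, and SHA-1 is used simply as an example hash algorithm. Real review platforms and forensic tools handle this step for you.

```python
import hashlib
from typing import Dict, List, Set, Tuple


def sha1_of(data: bytes) -> str:
    """Hash value of a file's content (SHA-1 chosen only as an example)."""
    return hashlib.sha1(data).hexdigest()


def de_nist(files: Dict[str, bytes], known_hashes: Set[str]) -> Tuple[List[str], List[str]]:
    """Split a collection into (kept, set_aside) by matching hashes to a known list."""
    kept, set_aside = [], []
    for name, content in files.items():
        if sha1_of(content) in known_hashes:
            set_aside.append(name)   # matches a known application file
        else:
            kept.append(name)        # not on the list, stays in the review set
    return kept, set_aside


# Hypothetical stand-in for the NIST list (the real one holds millions of entries).
known_hashes = {sha1_of(b"standard application library")}

# Hypothetical discovery collection: file name -> file content.
collection = {
    "reader_plugin.dll": b"standard application library",
    "letter_to_client.docx": b"user-written letter",
}

kept, set_aside = de_nist(collection, known_hashes)
print(kept)       # ['letter_to_client.docx']
print(set_aside)  # ['reader_plugin.dll']
```

Note that the sketch sets matching files aside rather than deleting them, which mirrors the choice between identifying and removing files.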

While many legal review teams expect the De-NIST process to get rid of every application or system file within a data collection, it is important to note that the NIST list does not contain every single system file. Though it may not remove all of the system files, it can significantly reduce the dataset, especially when working with copies of hard drive images.

When presented with an overwhelming river of information, trying to find relevant information can feel like you’re panning for gold. De-NIST’ing can help to identify or get rid of much of the water, stones and muck and leave you with a much more manageable pan.


What is a “Load File”?

A “load file” is a special kind of file that you may encounter in sets of case-related materials. While there are many different flavors of load files, they all serve the same general purpose: they can be used by litigation support software to import (i.e. “load”) information about case-related documents.

Document information may include:

  • Names and locations of image files (typically scanned paper files).
  • Document unitization information (i.e. document breaks).
  • OCR (searchable text) file names and locations.
  • Electronic document (ESI) file names and locations.
  • Extracted metadata information.
  • Other fielded document information.

Load files can play an important role in assisting with the setup of a case document database. When properly used, they can make the process of importing documents into litigation support applications faster and more efficient. Some programs that support the importing of load files include evidence review programs (like Summation, Concordance and IPRO) and trial presentation programs (like TrialDirector and Sanction).

Load files have different file extensions depending on the program they are designed to work with. When talking with litigation support vendors, or discussing the format of discovery with opposing counsel, it is important to recognize which load file formats work with your litigation support programs.

Some common file extensions of load files that you might encounter are:

  • .DII      designed to work with AD Summation
  • .OPT   designed to work with Concordance 
  • .LFP    designed to work with IPRO products
  • .OLL    designed to work with TrialDirector
  • .SDT   designed to work with Sanction
  • .DAT   generic document information load file   
  • .CSV   generic document information load file
  • .XML   new “EDRM” style load file format that works with many platforms 
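If you receive a folder with several load files, a small lookup table can help sort them by the program they were designed for. The following is only a sketch in Python: the table simply restates the list above, and the names `LOAD_FILE_TYPES` and `describe_load_file` are our own. Keep in mind that extensions such as .CSV and .XML are also used for files that are not load files at all, so a match is a hint rather than proof.

```python
from pathlib import PurePath

# Extension -> intended program, restating the list above.
LOAD_FILE_TYPES = {
    ".dii": "AD Summation",
    ".opt": "Concordance",
    ".lfp": "IPRO products",
    ".oll": "TrialDirector",
    ".sdt": "Sanction",
    ".dat": "generic document information load file",
    ".csv": "generic document information load file",
    ".xml": "EDRM-style load file (works with many platforms)",
}


def describe_load_file(filename: str) -> str:
    """Look up a file's extension (case-insensitively) in the table."""
    extension = PurePath(filename).suffix.lower()
    return LOAD_FILE_TYPES.get(extension, "not a recognized load file extension")


# Hypothetical file names, used only for illustration.
print(describe_load_file("VOL001.LFP"))   # IPRO products
print(describe_load_file("VOL001.dii"))   # AD Summation
print(describe_load_file("notes.txt"))    # not a recognized load file extension
```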

Many load files contain the path of the image files associated with a record. They may also contain meaningful additional information about the documents. For scanned documents, this may include a Bates or control number, coded document information (like document type, date, title, etc…) and information about OCR (searchable text) files that might be associated with the document. Load files for electronic documents (ESI) may also include extracted metadata (associated information about the files such as author, date created, file size, etc…).

Most load files are simple lines of text that can be read by litigation support programs. When viewed in a text program like WordPad or MS Word, we can see what the lines contain. Here is an excerpt from a sample .LFP load file (as seen in WordPad):

IM,D0022,D,0,@DISK001;DATA\IMAGES00;D0022.TIF;2,0
IM,D0023,D,0,@DISK001;DATA\IMAGES00;D0023.TIF;2,0
IM,D0024, ,0,@DISK001;DATA\IMAGES00;D0024.TIF;2,0
IM,D0025,D,0,@DISK001;DATA\IMAGES00;D0025.TIF;2,0

This particular load file contains information about document images.  Litigation support software programs can read this file and know:

  1. the record identifier (usually the Bates number) of a document
  2. where one document ends and another begins
  3. where to find the scanned paper .TIF files associated with a document
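To show how a program might read those three pieces of information, here is a rough sketch in Python that parses the four sample lines above. The field meanings are our assumptions based on the sample: the second field is the record identifier, a "D" in the third field marks the first page of a new document (a blank means the page continues the previous document), and the section starting with "@" holds the volume, folder and file name separated by semicolons. The names `ImageRecord`, `parse_im_line` and `group_documents` are hypothetical; consult your program's documentation for the authoritative format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ImageRecord:
    image_key: str         # record identifier (usually the Bates number)
    starts_document: bool  # assumed: "D" flag marks the first page of a document
    volume: str
    folder: str
    filename: str


def parse_im_line(line: str) -> ImageRecord:
    """Parse one 'IM' line, assuming the field layout described above."""
    tag, key, flag, _offset, location = line.strip().split(",")[:5]
    if tag != "IM":
        raise ValueError("not an image (IM) line: " + line)
    volume, folder, filename, _file_type = location.split(";")
    return ImageRecord(key, flag.strip() == "D", volume.lstrip("@"), folder, filename)


def group_documents(records: List[ImageRecord]) -> List[List[str]]:
    """Group page identifiers into documents using the document-break flag."""
    documents: List[List[str]] = []
    for record in records:
        if record.starts_document or not documents:
            documents.append([])
        documents[-1].append(record.image_key)
    return documents


sample_lines = [
    r"IM,D0022,D,0,@DISK001;DATA\IMAGES00;D0022.TIF;2,0",
    r"IM,D0023,D,0,@DISK001;DATA\IMAGES00;D0023.TIF;2,0",
    r"IM,D0024, ,0,@DISK001;DATA\IMAGES00;D0024.TIF;2,0",
    r"IM,D0025,D,0,@DISK001;DATA\IMAGES00;D0025.TIF;2,0",
]

records = [parse_im_line(line) for line in sample_lines]
print(group_documents(records))
# [['D0022'], ['D0023', 'D0024'], ['D0025']]
```

Under these assumptions, the sample describes three documents: D0022 and D0025 are single pages, while D0023 and D0024 together form one two-page document.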

There may be times when you will receive multiple types of load files within the same set of documents.  Some of the files may contain the same information, but are designed to work with different database programs.  When working with vendors, let them know what litigation support database programs you intend to use so that they give you compatible load files. 

If you receive load files that are not designed for your database program, you may need to convert them to make them compatible. Fortunately, there are a few free load file conversion programs available. Two such programs are:

    1. ReadyConvert from Compiled Services (compiledservices.com)
    2. iConvert+ from IPRO Tech (iprotech.com)

To find out more about how load files can best be used to interact with your existing litigation support applications, refer to the help and support documents of the program. Quite often, these are the best resource for describing how load files interact with the case database, and they will often demonstrate the load file import process.