Using online tool for searching large database of Word or PDF documents

From TechCampGlobal

Group 7

Video Presentation

Project Title:

Using online tool for searching large database of Word or PDF documents

NGO Project Lead:

  • Andrew
  • Thomas

All Project Participants:

  • Jeton, Kosovo
  • Nehat, Kosovo
  • Senko, Croatia
  • Stevica, Croatia
  • Zheljana, Croatia
  • Lidija, Macedonia
  • German, Macedonia
  • Ana, Macedonia
  • Neda, Macedonia

Summary of the Problem

  • Searching multiple PDF or Word documents takes too long
  • Specific documents are hard to find because of the large pool of information
  • Data from documents available online is hard to categorize
  • Data in PDF documents is hard to convert to other formats (mostly Excel and Word) so that it can be used
  • Most available online tools are designed for English and are not suitable for languages that have grammatical cases
  • Researchers in various disciplines do not necessarily have the specialized IT skills needed to search large databases quickly and easily

Proposed Solution


  • Use a low-cost online tool for searching a large number of documents, one that does not require specific skills or extensive and expensive training

Technology to Solve the Problem

Possibilities to solve this problem include online tools, but also offline tools.

  1. Online tools:
  • DocumentCloud
  • Google Docs (Google Drive)
  2. Offline tools:
  • Spotlight (Mac only)
  • DocFetcher

Existing solutions or relevant links

  1. Online tools:
  2. Offline tools:
  • Spotlight (Mac only)
  • DocFetcher
  • Alfresco
  3. Other solutions (searching non-standard text, and organizing multiple documents):

Action Plan

Solution: Directions for using existing tools

Offline tools

  1. Install DocFetcher or Alfresco (Spotlight is built into macOS and needs no installation).
  2. Put the documents of interest in a directory on your hard drive.
  3. Open Spotlight, DocFetcher or Alfresco.
  4. Add the directory to Spotlight, DocFetcher or Alfresco, and “index” the directory.
  5. Search.
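
To illustrate what “indexing” a directory means in the steps above, here is a minimal Python sketch. It is not how Spotlight, DocFetcher or Alfresco are implemented. It assumes the documents have already been saved as plain-text (.txt) files, whereas the real tools extract the text from Word and PDF files themselves. The function names tokenize, build_index and search are hypothetical and chosen only for this example.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into words (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_index(directory):
    """Map each word to the set of .txt file names that contain it.

    Assumption: documents are plain-text files; Word/PDF text
    extraction is out of scope for this sketch.
    """
    index = defaultdict(set)
    for path in sorted(Path(directory).glob("*.txt")):
        for word in tokenize(path.read_text(encoding="utf-8")):
            index[word].add(path.name)
    return index


def search(index, query):
    """Return the sorted names of files that contain every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


if __name__ == "__main__":
    # Build a throwaway directory with two sample documents.
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "a.txt").write_text("Budget report for Skopje", encoding="utf-8")
        Path(folder, "b.txt").write_text("Annual report, Zagreb office", encoding="utf-8")
        index = build_index(folder)
        print(search(index, "budget report"))  # ['a.txt']
```

The index is built once and then reused for every query, which is why searching an indexed directory is much faster than opening each document in turn. Note that this sketch matches exact word forms only, so an inflected form of a word (as in languages with grammatical cases) would not match its base form.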

Online tools

  1. Create an account at Google or DocumentCloud.
  2. Upload the documents of interest to Google Drive or DocumentCloud.
  3. Search.

If you are searching scanned documents, use a tool with optical character recognition (OCR), such as DocumentCloud (free) or Adobe Acrobat Professional.

Non-standard text searching

The Overview Project is a tool for searching non-standard text. It currently supports multiple languages, but we are not sure whether it supports Slavic languages yet. If Slavic language support is not available, the administrator can be