White Paper

Cross-lingual Search Based on Concepts and Meaning

Cross-lingual search is the process of querying in one language to find relevant documents in other languages. Until recently, machine translation was the primary method of searching across languages, either by translating search queries into other languages or by translating searchable records into English. However, both machine and human translation lose valuable nuances and meaning present in the original text.

This white paper explores an approach that delivers better accuracy based on semantics (meaning), not translation. Semantic search goes beyond finding keywords to retrieving ideas suggested by the keywords. 

In part 1, we compare the traditional translation-based approach with a newer approach that uses semantic similarity through text embeddings, a way of representing words in natural language processing tasks that encodes their meaning as mathematical vectors.
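
As a rough illustration of semantic similarity through embeddings, the sketch below compares words by the cosine of the angle between their vectors. The four-dimensional vectors, the EMBEDDINGS table, and the function names are made up for this example and are not taken from the white paper; a real system would obtain vectors with hundreds of dimensions from a multilingual embedding model.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
# A real multilingual model would place translations close together
# in a shared vector space with far more dimensions.
EMBEDDINGS = {
    "dog": [0.9, 0.1, 0.0, 0.2],
    "perro": [0.85, 0.15, 0.05, 0.25],  # Spanish for "dog"
    "bank": [0.1, 0.9, 0.3, 0.0],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Sort candidate words by similarity to the query word, best first."""
    q = EMBEDDINGS[query]
    scored = [(c, cosine_similarity(q, EMBEDDINGS[c])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_similarity("dog", ["bank", "perro"])
for word, score in ranking:
    print(f"{word}: {score:.3f}")
```

With these toy vectors, the Spanish word ranks above the unrelated English word even though it shares no characters with the query, which is the behavior a translation-free, meaning-based search relies on.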

In part 2, we look at implementing semantic search and discuss:

  • How to retrofit an existing keyword search engine to add cross-lingual and fuzzy search

  • Ways to overcome issues of speed, especially when searching very large data sets

  • A specific use case — targeted topic and event extraction 

  • The special case of cross-lingual name matching

Download the white paper for free.
