How to extract countries from a text?


Solution 1

You could use pycountry for this task (it also works with Python 3):

pip install pycountry

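import pycountry

text = "United States (New York), United Kingdom (London)"

# Check every country known to pycountry and print those whose
# name appears verbatim (case-sensitive) in the text.
for country in pycountry.countries:
    if country.name in text:
        print(country.name)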
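A plain substring test can over-match, since "Niger" is also a substring of "Nigeria". The following is a minimal sketch of one way around that, assuming whole-word matching is acceptable for your texts. The function name `find_countries` and the list `SAMPLE_NAMES` are hypothetical and only illustrate the idea; the sample list stands in for the names pycountry would provide, so the snippet runs without any third-party package.

```python
import re


def find_countries(text, names):
    """Return the names that occur in `text` as whole words,
    ordered by where they first appear."""
    hits = []
    for name in names:
        # \b anchors the match at word boundaries, so "Niger"
        # does not match inside "Nigeria".
        match = re.search(r"\b" + re.escape(name) + r"\b", text)
        if match:
            hits.append((match.start(), name))
    return [name for _, name in sorted(hits)]


# Hypothetical stand-in for [c.name for c in pycountry.countries]
SAMPLE_NAMES = ["Niger", "Nigeria", "Spain", "United Kingdom", "United States"]

print(find_countries("I live in Spain", SAMPLE_NAMES))
# ['Spain']
print(find_countries("United States (New York), United Kingdom (London)", SAMPLE_NAMES))
# ['United States', 'United Kingdom']
print(find_countries("Lagos is in Nigeria", SAMPLE_NAMES))
# ['Nigeria']
```

With pycountry installed, the real list could be passed as `[c.name for c in pycountry.countries]`. Matching stays case-sensitive and only finds the exact names in that list, so a text that says "Russia" would not match pycountry's "Russian Federation".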

Solution 2

A newer version of this library, named geograpy3, supports Python 3:

pip install geograpy3

It allows you to extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city.

Example:

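import geograpy
import nltk

# One-time download of the NLTK data used for named-entity extraction
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('words')

# Fetch the page and classify the place names found in it
url = 'http://www.bbc.com/news/world-europe-26919928'
places = geograpy.get_place_context(url=url)
print(places.countries)  # country names detected on the page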

You can find more details under this link:

Author: Markus

Updated on June 16, 2022

Comments

  • Markus, almost 2 years ago

    I use Python 3 (I also have Python 2 installed) and I want to extract countries or cities from a short text. For example, text = "I live in Spain" or text = "United States (New York), United Kingdom (London)".

    The expected output for countries:

    1. Spain
    2. [United States, United Kingdom]

    I tried to install geography but I am unable to run pip install geography. I get this error:

    Collecting geography
    Could not find a version that satisfies the requirement geography (from versions: )
    No matching distribution found for geography

    It looks like geography only works with Python 2.

    I also have geopandas, but I don't know how to extract the required info from text using geopandas.