How to extract countries from a text?
Solution 1
You could use pycountry for this task (it also works with Python 3):
pip install pycountry
import pycountry
text = "United States (New York), United Kingdom (London)"
# Print every country whose name appears as a substring of the text
for country in pycountry.countries:
    if country.name in text:
        print(country.name)
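Note that a plain substring test such as `country.name in text` can over-match, for example "Niger" inside "Nigeria". A minimal sketch of a whole-word variant follows. The helper `find_countries` and the short `COUNTRY_NAMES` list are hypothetical and only stand in for the names you would collect from `pycountry.countries`, so that the snippet runs without the library installed.

```python
import re

# Hypothetical stand-in for: [c.name for c in pycountry.countries]
COUNTRY_NAMES = ["Niger", "Nigeria", "Spain", "United Kingdom", "United States"]


def find_countries(text, names=COUNTRY_NAMES):
    """Return the names found in text as whole words, in order of appearance."""
    hits = []
    for name in names:
        # \b keeps "Niger" from matching inside "Nigeria"
        match = re.search(r"\b" + re.escape(name) + r"\b", text)
        if match:
            hits.append((match.start(), name))
    # Sort by position in the text, then drop the positions
    return [name for _, name in sorted(hits)]


print(find_countries("I live in Spain"))
print(find_countries("United States (New York), United Kingdom (London)"))
print(find_countries("Lagos is in Nigeria"))
```

With the real library, the list of names would presumably be built from `pycountry.countries` instead of the hard-coded sample; matching is still exact, so alternative spellings such as "USA" would not be found.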
Solution 2
There is a newer version of the geograpy library, named geograpy3, that supports Python 3:
pip install geograpy3
It allows you to extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city.
Example:
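import geograpy
import nltk
# One-time download of the NLTK data used for named-entity extraction
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('words')
url = 'http://www.bbc.com/news/world-europe-26919928'
# Fetch the page and extract the places mentioned in it
places = geograpy.get_place_context(url=url)
# Country names found in the page
print(places.countries)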
You can find more details under this link:
Markus
Updated on June 16, 2022
Markus almost 2 years
I use Python 3 (I also have Python 2 installed) and I want to extract countries or cities from a short text. For example,
text = "I live in Spain"
or
text = "United States (New York), United Kingdom (London)"
The answer for countries:
- Spain
- [United States, United Kingdom]
I tried to install geography, but I am unable to run pip install geography. I get this error:
Collecting geography
Could not find a version that satisfies the requirement geography (from versions: )
No matching distribution found for geography
It looks like geography only works with Python 2. I also have geopandas, but I don't know how to extract the required info from text using geopandas.