How to scrape articles from a website
Aug 10, 2021
Hello everyone. In this article we will look at how to scrape articles from a website using a Python package called newspaper.
Let's get started. This is a short article, and we will use one of my Medium stories to show how it is done. Feel free to use the same approach in your own NLP projects.
Install the newspaper3k library
! pip install newspaper3k
Article
from newspaper import Article
url = 'https://lerekoqholosha9.medium.com/data-preprocessing-with-pandas-23728a06cec5'
article = Article(url)
article.download()
print(article.html)
Parse article
article.parse()
article.authors
article.title
article.publish_date
article.tags
print(article.text)
print(article.top_image)
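If you want to keep the parsed fields for an NLP project, one option is to collect them into a single record and serialize it as JSON. The sketch below is only an illustration: the helper name article_to_record is made up for this example, and it assumes publish_date is either a datetime or None and that tags is a set of strings. The stand-in object and its values are placeholders so the example runs without a network connection; in practice you would pass the parsed article instead.

```python
import json
from datetime import datetime
from types import SimpleNamespace


def article_to_record(article):
    """Collect parsed article fields into a JSON-friendly dict.

    Hypothetical helper (not part of newspaper). Assumes publish_date is a
    datetime or None and tags is a set of strings.
    """
    publish_date = article.publish_date
    return {
        "title": article.title,
        "authors": list(article.authors),
        # datetime objects are not JSON serializable, so store ISO text
        "publish_date": publish_date.isoformat() if publish_date else None,
        # sets are not JSON serializable either; sort for a stable order
        "tags": sorted(article.tags),
        "top_image": article.top_image,
        "text": article.text,
    }


# Stand-in with placeholder values, used here instead of a real download.
demo = SimpleNamespace(
    title="Example title",
    authors=["Example Author"],
    publish_date=datetime(2021, 1, 1),
    tags={"pandas", "data"},
    top_image="https://example.com/image.png",
    text="Example body text.",
)

record = article_to_record(demo)
print(json.dumps(record, indent=2))
```

With a real article, calling article_to_record(article) after article.parse() should give the same structure.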
NLP
import nltk
nltk.download('punkt')
article.nlp()
article.keywords
print(article.summary)
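The steps so far have to run in order: download, then parse, then nlp. A small wrapper can make that order explicit. This is a minimal sketch, not an official pattern from the package: the function name summarize_url and the article_factory parameter are hypothetical, added so the function can be tried with a stand-in class when no network is available.

```python
def summarize_url(url, article_factory=None):
    """Download, parse, and run NLP on one URL; return keywords and summary.

    article_factory is a hypothetical hook (not part of newspaper) that lets
    you swap in a stand-in class for testing without network access.
    """
    if article_factory is None:
        # Imported lazily so the real package is only needed for real runs.
        from newspaper import Article
        article_factory = Article

    article = article_factory(url)
    article.download()  # fetch the HTML
    article.parse()     # extract title, text, authors, and so on
    article.nlp()       # needs nltk's 'punkt' data, downloaded earlier
    return {
        "url": url,
        "title": article.title,
        "keywords": list(article.keywords),
        "summary": article.summary,
    }


class FakeArticle:
    """Stand-in that mimics the calls used above, with placeholder values."""

    def __init__(self, url):
        self.url = url
        self.title = ""
        self.keywords = []
        self.summary = ""

    def download(self):
        pass

    def parse(self):
        self.title = "Example title"

    def nlp(self):
        self.keywords = ["example", "keywords"]
        self.summary = "Example summary."


result = summarize_url("https://example.com/post", article_factory=FakeArticle)
print(result)
```

For a real run, call summarize_url(url) without the second argument so that newspaper's Article class is used.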
Conclusion
We have now scraped the article with Python. You can look at the documentation for more features of this package.
Thank you for reading.
Please let me know if you have any feedback.