AI Analysis of Historical U.S. Newspapers Reveals New Tools For Mining the Past

  • 📰 DiscoverMag

Deep learning techniques are turning newspaper records into valuable research tools

In 1914, the biggest story in newspapers across the U.S. was the world war that had recently broken out in Europe, with a big question mark hanging over whether the U.S. would take part. The same story dominated U.S. newspapers in 1915, 1917 and 1918. But in 1916, another story captured the attention of the American public, one that is much less well known today.

At least, it was until Melissa Dell at Harvard University in Cambridge and colleagues entered the scene. This group has created a deep learning algorithm that detects the layout of a newspaper page and distinguishes between the types of text on it. It then uses optical character recognition to read the stories, clearly labelling the headlines, bylines and captions and ignoring adverts.
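The filtering step in that description can be illustrated with a minimal sketch. It assumes the layout detector has already assigned each region of a page a type label; the `Region` class, the label names and the `keep_editorial` helper are hypothetical and are not taken from the researchers' code.

```python
from dataclasses import dataclass

# Hypothetical label names; the real system's label set is not given in the article.
EDITORIAL_KINDS = {"headline", "byline", "caption", "article"}


@dataclass
class Region:
    kind: str  # label assigned by a layout detector (assumed to exist upstream)
    text: str  # OCR output for the region


def keep_editorial(regions):
    """Return only the regions whose label marks editorial content, dropping adverts."""
    return [r for r in regions if r.kind in EDITORIAL_KINDS]


# Toy page, invented purely for illustration.
page = [
    Region("headline", "Example headline"),
    Region("advert", "Example advert"),
    Region("article", "Example story text"),
]
kept = keep_editorial(page)
```

Here `kept` holds the headline and the story text, each still carrying its label, while the advert is discarded.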

They then picked the largest cluster of stories in each year and manually read a sample of stories from each cluster to confirm its topic. That produced a list of the biggest stories for each year from 1885 to 1920, including 1916, when Pancho Villa dominated headlines.
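Picking the largest cluster per year amounts to a simple count, sketched below. The sketch assumes each story has already been given a year and a cluster identifier; the function name, the input format and the toy data are illustrative assumptions, not the researchers' implementation.

```python
from collections import Counter, defaultdict


def largest_cluster_per_year(stories):
    """Map each year to (cluster_id, story_count) for its biggest cluster.

    `stories` is an iterable of (year, cluster_id) pairs (assumed format).
    """
    counts = defaultdict(Counter)
    for year, cluster_id in stories:
        counts[year][cluster_id] += 1
    return {year: c.most_common(1)[0] for year, c in sorted(counts.items())}


# Toy data, invented purely for illustration.
stories = [
    (1915, "c1"), (1915, "c1"), (1915, "c2"),
    (1916, "c3"), (1916, "c3"), (1916, "c3"), (1916, "c1"),
]
top = largest_cluster_per_year(stories)
```

The cluster identifiers carry no meaning on their own, which is why a manual read of sampled stories is still needed to name each topic.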
