Quantum Stat
We Build NLP for the Bravehearts ✌ quantumstat.com
Botticelli

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

Ultima Ratio Regum

Hey welcome back! EMNLP happened this week 👀. Tons of research came out and this newsletter won’t do justice to all of the great research conducted by institutions worldwide. But first…

We will be releasing an update to the Big Bad NLP Database this week and also a large update to the Super Duper NLP Repo after Thanksgiving. These updates will be delivered via our email newsletter; if interested, you can sign up on our homepage.

As always, if you enjoy this read, please give it a 👏👏 and share with your enemies. 😁

Ok, knowledge graphs time: Once again, Michael Galkin released his incredibly detailed round-up newsletter 🔥🔥. After a strong start in 2019 for knowledge-augmented language models, it seems they continue to be the hot ticket for this year. Below is the TOC and link to full blog post (*warning* it’s extensive and…


med

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

Dead Languages

Welcome back! Plenty of things to talk about. We have a few conferences coming up over the next few weeks: EMNLP 2020 starts tomorrow and then we have everyone’s favorite NeurIPS beginning on the 6th of December.

Also, we updated the El Grande y Mal Base de Datos de NLP this week and added 25 new datasets. Highlights include the Russian SuperGLUE and Chinese CLUE Benchmark! 😎

Having difficulty staying ahead of the NLP research curve? Maybe this is why…


Speaking of EMNLP and NeurIPS: Paper Digest aggregated a few papers from these conferences and included their code links. Enjoy. …



NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

The Last Crusade

This past week, the popular youtube-dl repo on GitHub was taken down by a DMCA notice (a.k.a. “take this down or we’ll sue you”) from the Recording Industry Association of America (RIAA), causing an epic unraveling of major proportions. If you are not in the loop, this repo housed a command-line tool for downloading videos from YouTube (and other sites), used by MANY developers in the open-source community.

Fallout:

Ever since the takedown notice, hackers have been going nuts online mirroring countless “youtube-dl” repos (Streisand effect). They even used steganography to embed the source code in pictures 😭😭😭. …


Photo by Youhana Nassif on Unsplash

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

On Her Majesty’s Service

Was knocking about the Big Bad NLP Database when I arrived on a unique dataset. This lot, and its accompanying GitHub repo, struck me as bizarre: first because of its content, and second because of its author/sponsor. So I went down the rabbit hole.

The dataset, called re3d, was created by a couple of consultancies in the UK on behalf of the Defence Science and Technology Laboratory (DSTL), which is part of the ultra-secret Porton Down government facility. The tech lab is a daughter agency of the UK Ministry of Defence and you can think of it as the UK’s version of DARPA/Skunkworks. And they have an “interesting” history. But why does one of the UK’s most secretive labs have an interest in NLP, and more specifically, an entity/relation extraction dataset? …


Photo by Fatih Yürür on Unsplash

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

Crash Override

Hey, welcome back! It’s a brand new week. And if you are unaware, we tapped into the gold mine of arXiv’s new code link feature (mentioned in last week’s newsletter). Recently, arXiv collaborated with PapersWithCode to conveniently link any associated repos to their accompanying paper on the abstract page (which is much better than stalking the PDF). 👇

declassified

Well… we wanted to know if we could extract links to say… all of the NLP-related papers published in a trailing 5-day week?!?! …
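As a rough illustration of the first half of that idea, here is a minimal sketch that lists recent cs.CL submissions through arXiv's public Atom API and keeps those from the trailing 5 days. This is an assumption about one possible approach, not the method used for the newsletter; the function names `build_query_url` and `recent_entries` are made up for this example, and looking up the PapersWithCode repo links is not shown.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace prefix used by every element in arXiv's Atom feed.
ATOM = "{http://www.w3.org/2005/Atom}"


def build_query_url(category="cs.CL", max_results=100):
    """Build an arXiv API URL for the newest submissions in one category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return "http://export.arxiv.org/api/query?" + urlencode(params)


def recent_entries(atom_xml, now, days=5):
    """Return (title, abstract-page URL) pairs published within `days` of `now`."""
    cutoff = now - timedelta(days=days)
    root = ET.fromstring(atom_xml)
    results = []
    for entry in root.iter(ATOM + "entry"):
        published = datetime.strptime(
            entry.findtext(ATOM + "published"), "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if published >= cutoff:
            title = entry.findtext(ATOM + "title").strip()
            link = entry.findtext(ATOM + "id").strip()
            results.append((title, link))
    return results


# Example usage (needs network access, so it is left commented out):
# from urllib.request import urlopen
# feed = urlopen(build_query_url()).read()
# for title, link in recent_entries(feed, datetime.now(timezone.utc)):
#     print(title, link)
```

The parsing step is kept separate from the download so it can be tried on a saved feed without hitting the API.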


Photo by Ankhesenamun on Unsplash

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

An Ode to Code

Hey, welcome back! We have another great NLP Cypher for this week. And as always, if you enjoy the read, please give it a 👏👏 and share it with your enemies.

Also, we updated the Big Bad NLP Database and Super Duper NLP Repo with 14 new datasets and 5 new notebooks. We would like to thank Tathagata Raha, Don Tuggener, and Yanai Elazar for their awesome contributions. 👩‍💻

New: one of the notebooks includes “speech-to-text” inference from Silero 😁

FYI: the majority of the BBND update came from the CodeXGLUE benchmark (released last month) with several datasets used for machine learning in code-related tasks (e.g. code completion or code-2-code translation, just to name a few). …


Photo by Clark Van Der Beken on Unsplash

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

State of the Big Bad NLP Database

Welcome back. Today’s newsletter will be a little different. We want to update everyone on the current state of the Big Bad NLP Database 😎.

First, we want to mention that today we updated the database with 10 new datasets, taking us past 600 total, or 50 new datasets for the month of September! We want to take this time to thank everyone who contributed to the BBND over the past year to make this repository one for the ages.

It all started in January of this year when we released the first version of the database with ~150 datasets. At the time, we thought it was already a big deal 😂 and we didn’t expect to grow it any further, but over time, as more and more datasets kept rolling in, we just kept adding. And as it stands, it is the biggest NLP repo in the world. …


Photo by Vinícius Henrique Photography on Unsplash

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

Legends

Hey…welcome back! Loads of research came out this week! But FYI, we couldn’t fit every story in this newsletter for space-saving reasons, so if you want complete coverage, follow our Twitter. And as always, if you enjoy the read, please give it a 👏👏 and share with your enemies.

And….yesterday, another update was made to the Super Duper NLP Repo and the Big Bad NLP Database: we added 10 datasets and 5 new notebooks. Highlights include the DialogRE dataset, which may be the first human-annotated dialogue-based relation extraction dataset, and the FarsTail dataset, a Persian-language NLI dataset. 😎

👁 Random Tales from the Dark Web 👁

Legend has it there’s a bitcoin wallet worth $690 million that hackers have been attempting to crack for the past 2 years, according to cybersecurity expert Alon Gal. The wallet’s .dat file is protected by very tough encryption, which is all that stands between hackers and a potential big pay-day. Some say this is all nonsense and the veracity of the claim is questionable, but who knows…🤷‍♂️. Recently there’s been a “thriving market for selling uncracked wallets” on the dark web, according to Gal. It may be our lottery ticket out of the matrix! …


Ibrahim Jabbar-Beik

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

EMNLP and Graphs 😵

☝ Persian art is pretty. Welcome back for another week of the Cypher. Yesterday, we made another weekly update to the Big Bad NLP Database and the Super Duper NLP Repo. We added 10 datasets and 6 new notebooks. This update was a good one since we added PyTorch Geometric notebooks for graph neural networks in case you all are feeling a bit adventurous.🙈

BTW, if you enjoy this newsletter please share it or give it a 👏👏!

Detour: I’ve been experimenting with ONNX Runtime inference on BERT question answering. Latency is significantly improved with ONNX, which is currently running on “okish” cloud CPUs; the latency range is 170–240ms. …
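For anyone who wants to collect a latency range like this themselves, here is a minimal timing sketch using only the Python standard library. It is not the setup used for the numbers above: `measure_latency_ms` and `fake_inference` are hypothetical names, and the stand-in workload just burns a little CPU so the example runs without a model.

```python
import statistics
import time


def measure_latency_ms(run_inference, runs=50, warmup=5):
    """Time a zero-argument callable and report min/median/max latency in ms."""
    # Warm-up calls are excluded so one-time setup costs do not skew the range.
    for _ in range(warmup):
        run_inference()

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_inference()
        timings.append((time.perf_counter() - start) * 1000.0)

    return {
        "min_ms": min(timings),
        "median_ms": statistics.median(timings),
        "max_ms": max(timings),
    }


def fake_inference():
    """Hypothetical stand-in for a real model call."""
    return sum(i * i for i in range(10_000))


stats = measure_latency_ms(fake_inference, runs=20, warmup=2)
print(f"latency range: {stats['min_ms']:.2f}-{stats['max_ms']:.2f} ms")
```

With ONNX Runtime, the callable could wrap a call such as `session.run(None, inputs)` on an `onnxruntime.InferenceSession`, assuming `inputs` already holds the tokenized question and context as arrays.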


The Ninth Wave (1850) Ivan Aivazovsky

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

Aere Perennius

Welcome back. Hope you enjoyed your week! We have another update that came in this Friday. Added another 11 datasets and 5 new notebooks. Want to thank all who contributed going back all the way to last week’s update: Won Ik Cho, Gayatri Venugopal, Moin Nadeem, Himanshu Chaudhary, Vincent Wu, Prafull Sharma, Yeshwanth Reddy & Manu Romero!

In case you are feeling adventurous: The truth is out there for Lex Fridman, who, out of left field, interviewed Cmdr. Fravor, an F/A-18 fighter pilot who engaged a UFO back in 2004 off the coast of Southern California, known colloquially as the “Nimitz incident”. …
