21 · MAY · 2023

Conferences and Sources for Scientific Papers

FLAVIO CLESIO · 4 min

One curiosity I always had as a student was how professors kept themselves up to date with the state of the art in the literature they were simultaneously writing and reading.

After asking many people with master’s degrees and PhDs from around the world, I came to the conclusion that each researcher has a specific and unique method, shaped by several factors: time, disposition, access, academic agenda, and even life stage.

But why stay up to date?

I use my small channel Artigos de ML as a way to refine my knowledge on topics I have genuine intellectual curiosity about. This gives me a lot of freedom to think about things outside the news cycle or hype.

Not to mention that I have no academic agenda to fill, which I particularly enjoy and find liberating from an intellectual standpoint — I can let my curiosity guide me through a less structured self-learning process.

Even though it’s unusual for someone in the industry, I like staying up to date through academic literature for several reasons:

1. The intellectual straitjacket of the corporate world: In the corporate world, learning through practice often limits how deeply I can understand a problem, especially in terms of experimental rigor. There are few places where mistakes are seen as learning rather than losses. Most scientific work is less constrained in this way, since there is freedom to understand the theory and apply it in a systematic, scientific way;

2. Jumping, methodologically, from the corporate off-road onto a Formula 1 track, mid-race: Depending on the research, an initial path toward a solution has already been created and validated systematically, with its potential and limitations laid out.

I deliberately avoid chasing the results that papers report (because of biases such as confirmation bias, congruence bias, the observer-expectancy effect, the decoy effect, selective perception, and the Semmelweis reflex) and focus instead on the methods these papers bring and how I can apply them day to day.

Thanks to this kind of prior exploration, I started using techniques such as Bag of Tricks in NLP, scalable boosting methods as early as late 2016, and depthwise separable convolution architectures for computer vision in 2017.

The corporate-world counterpart to this is all the biases described above, mixed into a swamp of nonsense, assorted corporate pressures, and the misuse of tools like A/B tests in countless meaningless experiments driven purely by executives’ egos;

3. It’s a type of literature that keeps me in contact with the state of the art and with the potential and limitations of my field of knowledge. That contact helps me be a better-informed professional in both theory and practice;

4. Depending on the maturity level of a given area, I can gain a broad overview that lets me navigate my field of interest more consciously and in a better-informed way.

Item 4 is where I see many people getting things backwards. I know Data Scientists and ML Engineers who are very talented but get “burned out” on something specific within ML and then jump to something entirely unrelated, like front-end development or management. These are useful skills in the modern job market, but they transfer poorly to one another.

I want to write about this one day, but in short: a portfolio of unrelated skills only dilutes the time available to reach mastery in a topic (which, at the end of the day, is the fortress a professional needs), and it won’t generate the compound effect (or snowball) that complementary skills create over a long career.

What to read?

This is the most important question of all, because time is finite and thousands of papers are published every day.

I don’t have a simple answer, since it depends on each person’s academic and professional moment. What I do have for myself is a criterion with two dimensions: (i) the intensity of the moment (which can be professional, personal, or both) and (ii) whether the reading needs to be critical.

a) Low intensity with no need for critical reading: Here I practice pure intellectual indulgence. I read whatever interests me, with no specific criteria. Readings range from bilingual development in early childhood and the cognitive questions involved, to how most medical scientific findings are false, to how venting generally doesn’t work and is actually pushing people toward states of permanent dissatisfaction and even aggression.

b) High intensity with no need for critical reading: Here I take a break from readings that demand more reflection, but I’m always looking at something more recent just to “know what’s happening” — nothing that will take my focus away from the current moment.

c) Low intensity with need for critical reading: This is the state I’m in when I know an interesting project is on the horizon that will require either an update on something I’ve been practicing for a long time, or leading a collective implementation effort that demands communicating with multiple people. An example was a computer vision project with requirements for both algorithmic performance (Recall@5 above 90%) and system performance (response time below 50 ms). I had to read many papers to arrive at something satisfactory and to explain the rationale behind the choice to data scientists, POs, SysAdmins, and Operations.

d) High intensity with need for critical reading: In this type of situation I take a break from some activities.
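
To make the two targets from the computer vision example in item c) concrete, here is a minimal sketch of how they could be checked. It is an illustration under assumptions, not the project's actual code: all function names are hypothetical, Recall@5 is taken to mean the fraction of relevant items found in the top 5 results averaged over queries, and "response time below 50 ms" is read as a 95th-percentile latency, which the original requirement does not specify.

```python
import time
from statistics import quantiles


def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the relevant items that appear in the top-k of one ranking."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    # Use a set for the top-k so duplicated results are not counted twice.
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def mean_recall_at_k(rankings, ground_truth, k=5):
    """Average recall@k over all queries."""
    scores = [recall_at_k(r, g, k) for r, g in zip(rankings, ground_truth)]
    return sum(scores) / len(scores)


def latencies_ms(predict, inputs):
    """Wall-clock time of each call to `predict`, in milliseconds."""
    timings = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def meets_targets(rankings, ground_truth, timings_ms,
                  min_recall=0.90, max_latency_ms=50.0):
    """Check both targets: mean recall@5 and p95 latency (p95 is an assumption)."""
    # quantiles(..., n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(timings_ms, n=20)[-1]
    recall = mean_recall_at_k(rankings, ground_truth, k=5)
    return recall > min_recall and p95 < max_latency_ms
```

In use, `rankings` would hold the model's ordered results per query, `ground_truth` the known relevant items, and `timings_ms` the output of `latencies_ms` over a representative sample of requests (at least two, which `quantiles` requires). Keeping the two checks in one function mirrors the point of the example: a candidate method from a paper is only "satisfactory" if it clears both bars at once.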

Where to read?

Some people read better at work to make better use of professional time; some people like to read during waiting periods — what I call buffer moments; some people have a specific place to read, like their home office or outdoors.

Personally, I like to use waiting situations for my reading.

Everything counts: Uber rides, airport waiting lounges, flights, buses. These kinds of situations are generally “dead” moments where socialization usually won’t happen anyway, and the time will be lost regardless — so I try to capitalize on them.

However, I do have one specific day of the week dedicated to reading, and it’s something I try to maintain with almost religious consistency. Rain or shine, I’ll read on that day of the week, even if only for 25 minutes. I’ve read in police station waiting rooms, at pre-wedding parties, in maternity ward waiting lounges, during overnight funeral vigils, on the beach, and at children’s birthday parties!

My inspiration for this came entirely from Jerry Seinfeld and his method called “Don’t Break the Chain.”

This helps me build a habit in a simple way, and maintaining the discipline is very straightforward. Regardless of my mood or energy level, I’ll read on that day of the week.

I definitely do not recommend a schedule this strict for everyone, as it demands a very high level of family and social understanding depending on each person’s situation.

Conferences

I have a very clear separation of conferences by tiers. And here I don’t rank them qualitatively per se, but by a degree of priority in terms of reading and tracking. I don’t follow the H-index or any pre-built list from the academic community, because I have distinct interests.

Currently I have 3 tiers:

Tier 1: Events whose dates I actively track. Once the proceedings are released, I try to read them as quickly as possible if they relate to work or personal research. Most of the time these will be venues with more applied work.

Tier 2: Events with relatively high relevance in the mainstream and strong editorial lines. I try to follow them as much as possible.

Tier 3: All the others, where I only search for something truly actionable or an interesting idea that slipped through the tiers above. Some are from adjacent or more theoretical areas.

It’s important to note that these are very personal criteria that took me some time to arrive at.

For example, NeurIPS and CVPR are clearly among the top conferences in Deep Learning and Computer Vision, but in my case most of the work coming out of those venues falls into the informational category (SOTA benchmarks, etc.) rather than being something I can apply day to day. RecSys is the opposite: every edition gives me a ton of papers to review because it directly impacts my daily work.

What about industry events?

With the technical hollowing-out of InfoQ events and the end of O’Reilly conferences, the industry events that interest me the most are those from the Linux Foundation — especially KubeCon.

But personally, I’ll confess that fewer and fewer events grab my attention — though that’s a topic for another post.