My problems with small libraries

There is a new supply chain attack every week. This article isn’t about them. I have other concerns with small dependencies: they suck for the developer experience.

Big frameworks work better

Small, modular, and composable libraries sound appealing, but they make you responsible for ensuring all your dependencies are:

compatible with each other
maintained to some extent (at least CVEs are fixed)
reasonably performant
not too painful to replace when that inevitably becomes necessary

Comprehensive frameworks make most of these decisions for you, reducing the burden and risk of evaluating countless small libraries. It’s not that I’m incapable of evaluating and comparing 5 libraries a day; it’s that the challenge isn’t as interesting as it might sound. It’s repetitive and unrewarding, so in the end, you’ll choose based on vibes.

Another big benefit of frameworks is that you can become a good $FRAMEWORK developer. You will never be an expert on the 500 random small modular libraries that cobble your project together.

It is possible to learn a framework. It probably ships with well-written, well-thought-out documentation and guidelines, making onboarding and handoff much easier than in a codebase built from a patchwork of small packages.

I can’t imagine having these benefits on a project that uses small, “composable” packages. I already smell spaghetti. Every new file I open might import a dependency with 2 weekly downloads and documentation that is nowhere to be found. Maybe it used to exist, but now the domain leads to a sketchy site doing SEO hacks.

The HTTP API wrapper

I’m especially suspicious of HTTP API client libraries. I made the mistake of developing one myself, which allowed me to experience its drawbacks firsthand. Later, I had the opportunity to experience the same problems as a user of such libraries, so I can confidently say that it wasn’t just my own incompetence.

The problem with these libraries is that they need to be kept in sync with the API. When new endpoints, fields, or data structures are added, the maintainer has to implement them. I have been stuck with a Stripe library that couldn’t do the one thing I needed. Maybe there are escape hatches: you can call a custom endpoint instead of using the built-in function. But then you lose almost all the benefits of having an API-specific client library.

Error handling is also often a pain. Maybe the library developer thought anything other than 200 should raise an exception, but maybe in my specific use case, a 404 is a totally valid response.
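To make the 404 case concrete, here is a minimal sketch of what I mean, using only Python’s standard library. The function name fetch_json_or_none and the injectable opener parameter are my own inventions for illustration, not part of any particular client library.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Callable, Optional


def fetch_json_or_none(
    url: str,
    opener: Callable[..., Any] = urllib.request.urlopen,
) -> Optional[Any]:
    """Return the decoded JSON body, or None when the server answers 404.

    `opener` is a hypothetical seam so the function can be exercised
    without a network; by default it is the standard urlopen.
    """
    try:
        with opener(url, timeout=10) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # In this use case "not found" is a normal answer, not a failure.
            return None
        # Every other error status is still treated as an error.
        raise
```

With a plain HTTP client, this decision is mine to make per call site. With a wrapper library, I would be catching and unwrapping whatever exception type its author picked.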

Anecdote #1

Recently, I needed to scrape a website. I picked up the most popular Selenium client library and... couldn’t connect to my Selenium instance. After hours of debugging, I figured out that the library code was adding hard-coded values to my connection params that were neither correct nor needed by me. I was lucky and found a way to override the connection function, sending only my correct params. The connection finally succeeded, just to fail right on the first click() function I tried to call. Even more hours of debugging followed, but I couldn’t find the cause.

At this point, I’m starting to suspect that this library is more trouble than it’s worth. I ask my trusty AI companion if it’s viable to ditch the library and implement my own. It swears to me that it would be insanely complex. Its more detailed reasoning includes that I would have to “manage the session manually”.

Eventually (not trusting my trusty AI companion), I go and look into the Selenium specs. Turns out it’s a very simple HTTP API, using JSON, with fairly high-level endpoints like /click. And the session that I need to manually manage? It’s a single session ID I receive in the response to my POST request to /session, and I have to include it in subsequent API calls.
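For a sense of how little “managing the session manually” involves, here is a rough sketch based on my reading of the W3C WebDriver protocol, which is what Selenium servers speak. The Browser class, its method names, and the injectable send function are hypothetical and exist only for this illustration. The empty capabilities object is an assumption too: your server may require something like a browserName in there.

```python
import json
import urllib.request
from typing import Any, Callable, Optional

# Key under which the WebDriver protocol returns an element reference.
ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"

Transport = Callable[[str, str, Optional[dict]], Any]


def http_transport(method: str, url: str, body: Optional[dict]) -> Any:
    """Send one JSON request and return the "value" field of the JSON reply."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["value"]


class Browser:
    """Hypothetical minimal client: one session ID, a few endpoints."""

    def __init__(self, base_url: str, send: Transport = http_transport):
        self.send = send
        base = base_url.rstrip("/")
        # Assumption: empty capabilities are enough for the server in use.
        created = send("POST", f"{base}/session", {"capabilities": {"alwaysMatch": {}}})
        # This is the whole "session management": remember the ID and
        # put it into the path of every later call.
        self.session_url = f"{base}/session/{created['sessionId']}"

    def open(self, url: str) -> None:
        self.send("POST", f"{self.session_url}/url", {"url": url})

    def click(self, css_selector: str) -> None:
        # Clicking takes two calls: look the element up, then click it.
        element = self.send(
            "POST",
            f"{self.session_url}/element",
            {"using": "css selector", "value": css_selector},
        )
        self.send("POST", f"{self.session_url}/element/{element[ELEMENT_KEY]}/click", {})

    def quit(self) -> None:
        self.send("DELETE", self.session_url, None)
```

Usage would look something like Browser("http://localhost:4444").open("https://example.com"), assuming a WebDriver server is listening at that address. Error handling and waits are left out, but those would be in my hands rather than buried in someone else’s defaults.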

I understand why a dedicated library sounds good at first, but if the underlying protocol is something as simple and widely used as HTTP, I think we should just use the HTTP client that we are most familiar with. It’s almost always less overhead than wrapping the wrapper to fit into our business logic. And no going back and forth between the API and library docs to find the matching features.

The library is smaller than the smallest useful unit of code

Another case I observed is when a library is just too small to be useful on its own. You’ll need to be familiar with a couple of other libraries and compose them together.

Anecdote #2

I worked on a project where 2 different ORMs were used: one supported only DDL, and the other only DML. For someone new to the codebase, it wouldn’t be obvious that these are two separate libraries with two completely separate DB connection stacks.

Inevitably, someone will try to call functions from both libraries in the same migration. There are no compilation errors or runtime errors. Both functions find their own connection and execute what you need. Except when there is an exception, and weirdly, only one operation is rolled back. After hours of debugging, you realize your 2 operations can’t share a transaction.
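The failure mode is easy to reproduce without either ORM. The sketch below is not the project’s code: it uses two plain sqlite3 connections from Python’s standard library as stand-ins for the two libraries, and the users table and the function name are made up for the example.

```python
import sqlite3


def run_failing_migration(db_path: str) -> tuple:
    """Run a "migration" over two connections and report what survived.

    Returns (table_exists, row_count) as seen after the failure.
    """
    ddl_conn = sqlite3.connect(db_path)  # stand-in for the DDL-only library
    dml_conn = sqlite3.connect(db_path)  # stand-in for the DML-only library
    try:
        # Step 1: schema change, committed on its own connection.
        ddl_conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )
        ddl_conn.commit()

        # Step 2: data change on the other connection. The second insert
        # violates NOT NULL and raises IntegrityError.
        dml_conn.execute("INSERT INTO users (name) VALUES ('alice')")
        dml_conn.execute("INSERT INTO users (name) VALUES (NULL)")
        dml_conn.commit()
    except sqlite3.IntegrityError:
        # Each connection can only roll back its own work.
        dml_conn.rollback()  # undoes the 'alice' insert
        ddl_conn.rollback()  # does nothing: the table is already committed
    finally:
        ddl_conn.close()
        dml_conn.close()

    check = sqlite3.connect(db_path)
    try:
        table_exists = check.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'users'"
        ).fetchone() is not None
        row_count = check.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    finally:
        check.close()
    return table_exists, row_count
```

After the failed run, the table is still there and the rows are gone: half of the migration was applied, and nothing in the code warned you that the two steps lived in different transactions.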

There is no happy ending: the libraries are cemented into the project, you won’t add a 3rd one, and this is not fun to work around.

Exceptions

I want to make it clear that there are perfectly valid situations where you don’t need a framework. Those cases are well-understood, self-contained problems. For example, a JSON serialization library that itself has 0 dependencies and needs 0 knowledge about the type of data you are going to serialize with it.

But generally, I’d rather pick a big framework I hate than try to connect hundreds of small libraries I like one by one.