I thought it'd be nice to write a few words about this journey: why and how I started, what I learned, and what difficulties I faced along the way.
Roughly speaking, I spent about 4 months on this project, averaging 1 to 1.5 hours per week.
Why build it?
There is another KWIC package in the npm registry. I didn't try or run it myself, but I read the source code and realized that the package does more than it is supposed to, such as stopword handling, punctuation removal, etc.
As I strongly believe in the UNIX philosophy that each program should do one thing and do it well, I decided to give it a shot and create a new package that does one thing: finding keywords in context.
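For readers unfamiliar with the term, here is a minimal sketch of what "keywords in context" means. The names `KwicLine` and `kwic` and the `width` parameter are made up for illustration and are not the package's API; the tokenization is deliberately naive.

```typescript
// Hypothetical names for illustration only; not the package's actual API.
interface KwicLine {
  left: string[]; // up to `width` tokens before the keyword
  keyword: string; // the matched token as it appears in the text
  right: string[]; // up to `width` tokens after the keyword
}

function kwic(text: string, keyword: string, width = 2): KwicLine[] {
  // Naive tokenization: split on runs of whitespace and drop empty strings.
  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  const lines: KwicLine[] = [];
  tokens.forEach((token, i) => {
    // Case-insensitive comparison of whole tokens.
    if (token.toLowerCase() === keyword.toLowerCase()) {
      lines.push({
        left: tokens.slice(Math.max(0, i - width), i),
        keyword: token,
        right: tokens.slice(i + 1, i + 1 + width),
      });
    }
  });
  return lines;
}

// Two matches for "the": one at the start, one before "mat".
console.log(kwic("the cat sat on the mat", "the", 2));
```

Each result line is just the keyword plus a window of neighbouring words, which is all a package with this single purpose needs to produce.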
The beginning of the project
The first step was to read about how to start a project in TypeScript. I was then ready to set up the fundamentals.
As I've internalized the wisdom of taking a moment to think and design before diving into the implementation, I started off with a simple README.md file where I thought about what the API would look like. Therefore, my first commit after the initial one didn't contain any serious code, just a concise README.md file.
Although I'm not a Test Driven Development maximalist, I definitely appreciated having tests in such a relatively small project, where the output can mostly be predicted.
My next step was setting up a test suite with ts-jest. As the project moved forward, having tests was such a relief.
First unplanned feature
After some progress, I thought it would be a good idea to create an example app alongside the package to present its capabilities.
While developing the example app, I realized that I could highlight the matched words inside the <textarea>, so that users can edit the text and see the matches in the same place.
Although it was definitely possible to implement it on my own, I decided to go for an existing solution. I came across a React package that highlights the text inside the <textarea>.
At this point, I realized that I had to find the range positions of the matched words; a simple concordance match wouldn't be enough. I decided to reconsider my plan and tackle this next.
Solving the first bug
I quickly implemented an algorithm that calculates the ranges of the words from all the matches. Although it seemed to work in the beginning, after some testing with the app, I realized that it was broken by design.
The cause was an assumption I had made about tokenization: it stripped all extra whitespace away and tokenized only the words. I should have tokenized not only the words but also the extra spaces. Even though I thought that was the right direction, I wanted to be extra sure. I then checked similar packages in both Python and R, and found that they take a similar approach to tokenizing text.
In principle, I always review other work to learn more about the subject and to validate my first ideas; this time, however, I admit that I was a bit late to do so.
Later, I changed how the tokenization happens (#3), and I am glad that it worked well.
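To illustrate why keeping the whitespace matters for ranges, here is a minimal sketch. It is not the package's actual implementation; `Token`, `tokenizeWithSpaces`, and `findRanges` are hypothetical names, and splitting on `\s+|\S+` is just one simple way to keep every character of the input.

```typescript
// Hypothetical names; a sketch under stated assumptions, not the real code.
interface Token {
  value: string;
  isWord: boolean; // false for runs of whitespace
  start: number; // inclusive offset in the original text
  end: number; // exclusive offset in the original text
}

function tokenizeWithSpaces(text: string): Token[] {
  const tokens: Token[] = [];
  // Match either a run of whitespace or a run of non-whitespace,
  // so that no character of the input is dropped.
  const pattern = /\s+|\S+/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const value = match[0];
    tokens.push({
      value,
      isWord: !/^\s/.test(value),
      start: match.index,
      end: match.index + value.length,
    });
  }
  return tokens;
}

function findRanges(text: string, keyword: string): Array<[number, number]> {
  // Offsets come straight from the tokens, so extra spaces cannot shift them.
  return tokenizeWithSpaces(text)
    .filter((t) => t.isWord && t.value.toLowerCase() === keyword.toLowerCase())
    .map((t): [number, number] => [t.start, t.end]);
}

// Note the double and triple spaces in the input.
console.log(findRanges("a  cat and   a dog", "a"));
```

For this input the second "a" starts at offset 13. A tokenizer that collapses the extra spaces first would compute offset 10 instead, which is exactly the kind of drift that breaks highlighting in an editable text area.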
I can say that my first serious experience with TypeScript packaging was quite good. I hadn't had a good chance before to create a finished project in a typed language. These are the main things I really liked about TypeScript:
- Static typing changes the mentality of how you program
- TypeScript LSP shines for auto-suggestions and linting
The tech stack I used to build it is:
- ts-jest for testing;
- tsc-alias for absolute path imports;
- eslint for linting;
- prettier for formatting.
Although I'm fairly content, I was a bit surprised that some of the TypeScript tooling still relies on external packages for things that I think TypeScript itself should have handled long ago, e.g. absolute path resolution.
After the ranges feature and a working example application, I was ready to publish the package to the npm registry. I had to unpublish the first few versions, which I had published for testing, but that went fine, since a package can be unpublished within 72 hours.
Probably due to the notorious left-pad incident, npm has a strict policy when it comes to unpublishing packages. I also couldn't publish the package again under the same version, even though that version had been deleted from npm; I had to bump the patch version. As far as I can tell from npm's unpublish policy, a version number can never be reused once it has been published, even if that version is later removed.