Category Encoders published in JOSS
The Journal of Open Source Software (JOSS) is a formal, peer-reviewed academic journal (ISSN 2475-9066) intended to be “developer friendly”. What does that really mean? If you have a project on GitHub that has some novel value and is well written, you can put together a submission in probably 30-40 minutes of working time.
The reviews all take place in GitHub issues. I got some helpful feedback there, made changes, and the paper was eventually accepted. The process took around a month end to end, and was pretty painless.
This means that anyone using category encoders in their own research or work has a convenient way to cite it, including the specific release, and that release will be available via Zenodo (or GitHub) for others to use in reproductions.
So if you have this kind of package, by all means, give JOSS a look, and if you use category encoders in research, please cite us:
https://joss.theoj.org/papers/d57818316816a19a80112892c3d12ed7
The Category Encoders Journey
This publication represents another milestone in the evolution of the category_encoders package. What began as an exploration of different encoding methods for categorical variables has grown significantly since being accepted into scikit-learn-contrib and made available on conda-forge.
The academic publication adds another layer of credibility and makes the package more accessible to researchers who need to cite their tools. It’s been rewarding to see how this project has evolved from a simple experiment to a tool used by data scientists around the world.
For those interested in the full story of how category_encoders grew from a weekend project to a widely used library with millions of downloads, check out my reflection on the complete category_encoders journey.