The Data Science Docker image you were waiting for!
When I looked for a suitable Docker image for my Data Science projects, I could not find one that matched my requirements, even though they are not unusual.
What’s important in Data Science (for me)
Many points in the list are already covered by the jupyter/all-spark-notebook Docker image.
To also include the missing points, I created and published my own Docker image franzdiebold/datascience-ultimate:
To give you a sneak peek, this is what JupyterLab looks like when using the image:
How can I use it?
The fastest way is to just run the following command in your shell:
docker run -p 8888:8888 -p 4040:4040 franzdiebold/datascience-ultimate
This will start JupyterLab in a Docker container and the following web apps will be available:
- JupyterLab: http://localhost:8888/lab/
- Spark Web UI: http://localhost:4040/
There’s an even better way
You probably want your local files and folders to be accessible from within the container. Therefore, we need to mount the current directory ($PWD) to the working directory (/home/jovyan) in the container:
docker run --rm -p 8888:8888 -p 4040:4040 -v "${PWD}":/home/jovyan franzdiebold/datascience-ultimate:latest
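If port 8888 or 4040 is already taken on your machine, or you want to mount a different folder, only the host side of the -p and -v options has to change. Here is a minimal sketch of a small helper that assembles the command for you. Note that ds_command and its arguments are my own illustration and not part of the image; the function only prints the command so you can check it before running it.

```shell
# Hypothetical helper (not shipped with the image): prints the docker run
# command for a given project directory and host ports without executing it.
# The body runs in a subshell, so the variables do not leak into your session.
ds_command() (
  project_dir="${1:-$PWD}"   # host directory to mount (default: current directory)
  jupyter_port="${2:-8888}"  # host port mapped to JupyterLab (container port 8888)
  spark_port="${3:-4040}"    # host port mapped to the Spark Web UI (container port 4040)
  printf 'docker run --rm -p %s:8888 -p %s:4040 -v "%s":/home/jovyan franzdiebold/datascience-ultimate:latest\n' \
    "$jupyter_port" "$spark_port" "$project_dir"
)
```

For example, ds_command /tmp/demo 8889 prints a command that serves JupyterLab on http://localhost:8889/lab/ and mounts /tmp/demo. Once the printed command looks right, copy it into your shell to start the container.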
You want even more?
If you want to learn more about how to dockerize Data Science, then check out my article “How to dockerize Data Science”:
If you like the Docker image:
- 🖥 Use it!
- ⭐️ Star it on GitHub: FranzDiebold/docker-datascience-ultimate
- ⭐️ Star it on Docker Hub: franzdiebold/datascience-ultimate
- 👍 Share it!
Thanks! 🙏
If you appreciate this post, here are a few things you can do to support my work:
- Give this story a clap. 👏
- Subscribe to my upcoming stories.
- Follow me on GitHub: https://github.com/franzdiebold