Carrierwave, ClamAV and Clamby
If you are building a web application, you will almost certainly want to support file uploads, an important feature in modern applications. CarrierWave is a popular Ruby gem that works with Rack-based web applications such as Ruby on Rails. It provides file uploading out of the box, along with a long list of related features.
If your web application accepts file uploads and you do not scan the files for viruses, you put at risk not only your own software but also the users of the application and their files.
To avoid such scenarios, we often whitelist the allowed file extensions and content types. This approach is not enough if you decide to whitelist executable uploads, or if the attacker uploads a malicious image or any other malicious file that has an allowed extension or content type.
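To make the limitation concrete, here is a minimal plain-Ruby sketch. The whitelist, the helper name `extension_allowed?` and the sample file name are hypothetical and only stand in for whatever extension check your uploader performs. The point is that the check looks at the name and never at the bytes.

```ruby
# Hypothetical whitelist, similar in spirit to an uploader's extension whitelist.
ALLOWED_EXTENSIONS = %w[jpg jpeg png gif].freeze

# True when the file name ends in one of the whitelisted extensions.
def extension_allowed?(filename)
  ext = File.extname(filename).delete_prefix(".").downcase
  ALLOWED_EXTENSIONS.include?(ext)
end

payload_name = "holiday.png"      # the name claims to be an image
payload_head = "MZ\x90\x00".b     # but the bytes start like a Windows executable

# The whitelist accepts the upload because only the name is inspected.
puts extension_allowed?(payload_name)
# The content, however, does not look like a PNG at all.
puts payload_head.byteslice(0, 2) == "MZ"
```

A content scan closes this gap because it inspects what the file actually contains.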
In this tutorial, I will show you how to use the Rails ActiveModel::Validator class to build a modular validator that scans each file upload in real time using ClamAV and the Clamby gem.
ClamAV® is an open source antivirus engine for detecting trojans, viruses, malware & other malicious threats.
The Clamby gem depends on the ClamAV command-line tools, such as clamscan, being installed already. If you install clamscan and try to run Clamby, you will notice that each scan takes a few seconds (around 10, depending on the available computing resources). This is because every scan starts a new clamscan process, which has to load the antivirus database, check the virus signatures and run other boot routines before the actual scan begins.
To overcome this issue, the creator of Clamby highly recommends setting the daemonize option to true. This keeps ClamAV in memory as a daemon, so the engine does not have to load for each virus scan. It saves several seconds per request.
The bad news is that a single ClamAV process consumes 600-800MB of memory on average.
Every Rails server/pod you run would pay that memory cost just to keep the virus database preloaded for real-time antivirus scans!
Fortunately, ClamAV has a TCP/IP socket-based interface, which means we can run a single shared process and access it remotely over TCP/IP sockets. Even better, we can run a cluster of distributed processes and load balance the virus scans across them. This sounds like a good plan 👌.
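To give a feel for what that interface looks like on the wire, here is a minimal sketch that streams a file to a clamd daemon with the INSTREAM command, using only Ruby's standard library. The helper names `clamd_instream` and `infected?` are hypothetical, and the host and port are whatever your daemon listens on. Later in this post Clamby does this work for us, so treat the sketch as an illustration of the protocol rather than something you need to ship.

```ruby
require "socket"

# Streams the bytes of `io` to a clamd daemon using the INSTREAM command and
# returns the daemon's one-line verdict, for example "stream: OK".
def clamd_instream(io, host:, port: 3310, chunk_size: 8192)
  TCPSocket.open(host, port) do |sock|
    sock.write("zINSTREAM\0")                 # "z" prefix: null-terminated command
    while (chunk = io.read(chunk_size))
      sock.write([chunk.bytesize].pack("N"))  # 4-byte big-endian length prefix
      sock.write(chunk)
    end
    sock.write([0].pack("N"))                 # zero-length chunk ends the stream
    sock.gets("\0").to_s.delete("\0").strip   # the reply is null-terminated too
  end
end

# clamd ends its verdict with "FOUND" when a signature matched.
def infected?(verdict)
  verdict.end_with?("FOUND")
end
```

You would call it with an open file, for example `File.open(path, "rb") { |f| clamd_instream(f, host: "some-clamd-host") }`, where the host name is a placeholder.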
Assumptions And Prerequisites
The following part of this post will show you how to deploy ClamAV as a service on K8s, access it from other pods (Rails) over a TCP/IP socket and how to configure Rails to utilize this service in a modular and DRY implementation.
This post makes the following assumptions:
- You have basic knowledge of how to build Docker images.
- You have a Docker environment running.
- You have a Kubernetes cluster running.
- Your Ruby on Rails application is containerized and running on Kubernetes.
- You have the kubectl command line (kubectl CLI) installed.
Step 1: Deploy ClamAV as a service on Kubernetes
To deploy ClamAV on Kubernetes, you need to configure a Kubernetes deployment and make it accessible through a Kubernetes service. The service exposes the deployment under a fully qualified domain name (FQDN) and load balances traffic across the deployment's replicas, with no unfamiliar service discovery mechanism needed (which makes the antivirus horizontally scalable).
- The kubernetes deployment will look like:
- The exposing service will look like:
Now, you can create the deployment and its exposing service using kubectl as follows:
Step 2: Configure Clamby to use ClamAV service
As shown in the previous step, ClamAV is now up and running as a Kubernetes deployment with 1 replica (you can add more replicas to scale it horizontally), listening on port 3310 over TCP. The Kubernetes service also makes sure that traffic going to antivirus-svc.shared.svc.cluster.local is load balanced across the replicas automagically.
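Before wiring up Clamby, it can help to confirm that the service is reachable from a Rails pod. The sketch below sends clamd's PING command and expects PONG back. The helper name `clamd_alive?` and the two-second connect timeout are assumptions made for illustration.

```ruby
require "socket"

# Returns true when a clamd daemon at host:port answers PING with PONG.
# Connection failures and DNS errors are reported as false.
def clamd_alive?(host, port = 3310, connect_timeout: 2)
  Socket.tcp(host, port, connect_timeout: connect_timeout) do |sock|
    sock.write("zPING\0")                             # null-terminated command
    sock.gets("\0").to_s.delete("\0").strip == "PONG"
  end
rescue SystemCallError, SocketError
  false
end
```

From a console inside the cluster you could then run `clamd_alive?("antivirus-svc.shared.svc.cluster.local")`.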
To configure the Clamby Ruby gem to connect to the ClamAV daemon at antivirus-svc.shared.svc.cluster.local on port 3310 over TCP sockets, we need to use the following Rails initializer:
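A minimal sketch of such an initializer, assuming the file lives at config/initializers/clamby.rb. The constant name `CLAMBY_OPTIONS` is hypothetical, and `stream: true` is my assumption: a remote daemon cannot see the Rails container's filesystem, so the file contents have to be sent over the socket. Check the Clamby README for the full list of options your version supports.

```ruby
# config/initializers/clamby.rb
CLAMBY_OPTIONS = {
  daemonize: true,                        # talk to a running clamd instead of spawning clamscan
  stream: true,                           # assumption: send file contents to the remote daemon
  config_file: "/etc/clamav/clamd.conf"   # client-side config that points at the daemon
}.freeze

# Bundler has already loaded the gem inside Rails. The guard only keeps this
# file loadable in environments where Clamby is not bundled.
Clamby.configure(CLAMBY_OPTIONS.dup) if defined?(Clamby)
```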
This initializer instructs the Clamby gem to use a ClamAV config file located at /etc/clamav/clamd.conf. This file does not exist yet, so we will create it as part of building the RoR Docker image used to run the application.
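For a remote daemon, the client-side file only needs the two clamd.conf directives that say where to connect: `TCPSocket` for the port and `TCPAddr` for the host. The sketch below renders that content in Ruby purely to show what the file should contain. The helper name `clamd_client_config` is hypothetical, and in practice you would write the same two lines from your image build.

```ruby
# Renders a client-side clamd.conf that points clamdscan at a remote daemon.
def clamd_client_config(host:, port: 3310)
  <<~CONF
    # Connect to a remote clamd over TCP instead of a local Unix socket.
    TCPSocket #{port}
    TCPAddr #{host}
  CONF
end

puts clamd_client_config(host: "antivirus-svc.shared.svc.cluster.local")
```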
So, your RoR Dockerfile should look something like:
Now, if you run rails c from a container that is running on the Kubernetes cluster and was built from this Dockerfile, you should be able to run the following command to perform ClamAV scans using the remote service over TCP:
# rails c
Loading development environment (Rails 5.2.3)
[1] pry(main)> Clamby.virus?('SOME_LOCAL_FILE_PATH')
ClamAV 0.101.1/25431/Fri Apr 26 08:57:33 2019
/app/SOME_LOCAL_FILE_PATH: OK
false # no virus 🎉
Step 3: An ActiveModel validator to utilize Clamby
With all of the infrastructure in place for running ClamAV as a remote service over TCP, and the RoR app configured to connect to it, it is time to write a modular, DRY and reusable ActiveModel validator that can scan every file the user uploads in real time.
An ActiveModel validator could look like:
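A minimal sketch, assuming the validated attribute is a mounted uploader whose `path` points at a locally cached file. The module name `AntivirusScan` and the error message are hypothetical. The scanning decision is kept in a small plain-Ruby module so it can be exercised without Rails, and the validator itself stays a thin wrapper around `Clamby.virus?`.

```ruby
# Plain-Ruby core: decides whether an uploaded file is infected.
# `scanner` is anything that responds to `virus?(path)`, such as Clamby.
module AntivirusScan
  def self.infected?(uploader, scanner:)
    path = uploader.respond_to?(:path) ? uploader.path : nil
    return false if path.nil? || path.to_s.empty?   # nothing uploaded, nothing to scan

    scanner.virus?(path) == true
  end
end

# The guard keeps this file loadable in a plain Ruby session. Inside Rails,
# ActiveModel is always defined and the validator is available as usual.
if defined?(ActiveModel::Validator)
  class AntivirusValidator < ActiveModel::Validator
    def validate(record)
      # `attribute_name` is the option passed to `validates_with`.
      attribute = options.fetch(:attribute_name).to_sym
      uploader  = record.public_send(attribute)
      return unless AntivirusScan.infected?(uploader, scanner: Clamby)

      record.errors.add(attribute, "appears to contain a virus")
    end
  end
end
```

Passing the scanner in as an argument is a design choice for testability: in a unit test you can hand in a stub that responds to `virus?` instead of talking to a real daemon.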
Whenever you need to scan a file uploaded through a mounted uploader on an ActiveRecord model, all you need to do is add the following one-line validation to the model:
validates_with AntivirusValidator, attribute_name: 'image'
Because the ClamAV process is already preloaded and running on the remote deployment, and because that deployment runs on the same Kubernetes cluster so all traffic stays local, a file scan takes ~20ms for small files (< 1MB) and a little more for bigger files. Do not hesitate to scan every single file uploaded by your end users: the process is not expensive, and everything is now in place to add a scan with one extra line of code.
Happy virus 🦠 scanning 👋