Vector: The Logstash Replacement That Uses 20x Less Memory

Logstash runs on the JVM and uses 500–800 MB of memory right after startup. On a VPS with 2–4 GB of RAM, that can be up to a third of the machine spent purely on log collection. Vector is written in Rust, starts in about a second, and uses 20–50 MB in a basic configuration. It covers the same job as Logstash: it collects logs from many kinds of sources, transforms them, and ships them to many kinds of destinations.

Installation

Debian / Ubuntu — official script:

bash -c "$(curl -L https://setup.vector.dev)"
sudo apt install vector

Check version and status:

vector --version
sudo systemctl status vector

Configuration file:

sudo nano /etc/vector/vector.yaml

After changes — validate and restart:

vector validate /etc/vector/vector.yaml
sudo systemctl restart vector

Concept: sources → transforms → sinks

Every Vector pipeline consists of three component types:

Sources — where to read logs from. Files, systemd journal, syslog, stdin, Kafka, HTTP.

Transforms — what to do with events. Parse, filter, rename fields, add metadata.

Sinks — where to send. Files, Loki, Elasticsearch, S3, ClickHouse, HTTP, stdout.

Each component gets a unique ID. Transforms and sinks declare inputs: the IDs of the components they pull events from. Together these references form the processing graph.
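
As a minimal sketch of how these references wire a graph together, the config below chains one source, one transform, and one sink. The component IDs (in, tag, out) and the added field are hypothetical names chosen for illustration.

```yaml
sources:
  in:
    type: stdin          # read lines from standard input

transforms:
  tag:
    type: remap
    inputs:
      - in               # pull events from the source above
    source: |
      .pipeline = "demo" # hypothetical field added to every event

sinks:
  out:
    type: console
    inputs:
      - tag              # pull events from the transform, not the source
    encoding:
      codec: json
```

Saved to a scratch file, this should run with vector --config <file>: anything typed on stdin comes back as a JSON event with the extra field.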

Simplest Example: Read a File, Write to Another

sources:
  app_logs:
    type: file
    include:
      - /var/log/myapp/*.log
    read_from: beginning

sinks:
  output_file:
    type: file
    inputs:
      - app_logs
    path: /var/log/vector/myapp-%Y-%m-%d.log
    encoding:
      codec: text

Read From systemd Journal

Source for system logs — all services through journald:

sources:
  journal:
    type: journald
    include_units:
      - nginx
      - php8.1-fpm.service  # name contains a dot, so spell out the full unit name
      - postgresql

Without include_units, Vector reads the entire journal, which on a busy server is a lot of events.
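
If most of the journal is wanted and only a few units are noise, the inverse filter may be simpler. A sketch, assuming the journald source's exclude_units option and using unit names picked purely as examples:

```yaml
sources:
  journal_all:
    type: journald
    current_boot_only: true   # skip entries from previous boots
    exclude_units:            # example units to leave out; adjust to your server
      - cron.service
      - systemd-logind.service
```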

Parsing the Nginx Access Log

Nginx writes logs in a text format. Vector can parse them into structured fields using VRL (Vector Remap Language):

sources:
  nginx_access:
    type: file
    include:
      - /var/log/nginx/access.log

transforms:
  parse_nginx:
    type: remap
    inputs:
      - nginx_access
    source: |
      . = parse_nginx_log!(string!(.message), "combined")

sinks:
  parsed_logs:
    type: file
    inputs:
      - parse_nginx
    path: /var/log/vector/nginx-%Y-%m-%d.json
    encoding:
      codec: json

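After parsing, each event contains structured fields such as client, user, timestamp, method, path, protocol, status, size, referer, and agent. Note the single-r spelling of referer, which follows the HTTP header name.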
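
Lines that do not match the combined format make parse_nginx_log! fail for that event. One possible way to keep such lines visible is to capture the error instead of using !, as in this sketch. The component ID parse_nginx_safe and the field names nginx and parse_error are assumptions made for the example, not Vector conventions.

```yaml
transforms:
  parse_nginx_safe:
    type: remap
    inputs:
      - nginx_access
    source: |
      # Capture the result and the error instead of aborting with "!"
      parsed, err = parse_nginx_log(string!(.message), "combined")
      if err != null {
        .parse_error = err   # keep the raw line and record why parsing failed
      } else {
        .nginx = parsed      # nest parsed fields; file and host metadata stay intact
      }
```

Nesting the result under .nginx, rather than overwriting the whole event with . = ..., also preserves the metadata the file source attaches.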

VRL: The Transformation Language

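VRL (Vector Remap Language) is a built-in expression language for working with events. It is type-safe and error-safe: a function that can fail must have its error handled, and the ! suffix used in the examples here is the shortest way to do that, stopping the program for that event if the call fails.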

Add a hostname field:

source: |
  .hostname = get_hostname!()

Rename a field:

source: |
  .ip = del(.client_addr)

Keep only errors (drop everything except 4xx and 5xx):

source: |
  if !starts_with(to_string!(.status), "4") && !starts_with(to_string!(.status), "5") {
    abort
  }

abort stops processing of the current event. With the remap transform's default setting (drop_on_abort: true) the event is dropped and never reaches the sink.
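
In a complete config the snippet sits inside a remap transform. The sketch below shows one way to write the same filter numerically; the ID only_errors is a hypothetical name, and it assumes .status is the integer produced by parse_nginx_log.

```yaml
transforms:
  only_errors:
    type: remap
    inputs:
      - parse_nginx
    drop_on_abort: true      # the default, spelled out for clarity
    source: |
      # Anything below 400 is not a client or server error
      if int!(.status) < 400 {
        abort
      }
```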

Parse JSON from a field:

source: |
  .parsed = parse_json!(.message)
  .level = .parsed.level
  .msg = .parsed.msg

Sending to Loki

sinks:
  loki:
    type: loki
    inputs:
      - parse_nginx
    endpoint: http://localhost:3100
    labels:
      app: nginx
      env: production
      host: "{{ hostname }}"
    encoding:
      codec: json

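Labels are the key to Loki performance. Keep them few (3–5) and low in cardinality. Never use an IP address or user agent as a label, because every distinct value creates a separate stream. The host label above is a template that reads the event's hostname field, so that field has to exist on the event, for example set with get_hostname!() in a remap transform.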
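
The parse_nginx transform shown earlier overwrites the whole event, so no hostname field survives it. A sketch of one way to supply it, with add_hostname as a hypothetical component ID:

```yaml
transforms:
  add_hostname:
    type: remap
    inputs:
      - parse_nginx
    source: |
      .hostname = get_hostname!()   # machine name, identical for every event

sinks:
  loki:
    type: loki
    inputs:
      - add_hostname                # read from the transform that sets .hostname
    endpoint: http://localhost:3100
    labels:
      app: nginx
      host: "{{ hostname }}"
    encoding:
      codec: json
```

The hostname is a safe label because it has a single value per machine.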

Sending to Elasticsearch

sinks:
  elasticsearch:
    type: elasticsearch
    inputs:
      - parse_nginx
    endpoints:
      - http://localhost:9200
    bulk:
      index: nginx-logs-%Y.%m.%d
    auth:
      strategy: basic
      user: elastic
      password: "${ES_PASSWORD}"

Environment variables are interpolated with ${...}, so the password does not have to be stored in plain text in the config file.
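
Vector's interpolation also accepts a shell-style default, which can keep a config usable when a variable is unset. A sketch, where ES_ENDPOINT and ES_USER are hypothetical variable names introduced for illustration:

```yaml
sinks:
  elasticsearch:
    type: elasticsearch
    inputs:
      - parse_nginx
    endpoints:
      - "${ES_ENDPOINT:-http://localhost:9200}"   # falls back to localhost if unset
    auth:
      strategy: basic
      user: "${ES_USER:-elastic}"                  # falls back to "elastic" if unset
      password: "${ES_PASSWORD}"                   # no default given for the secret
```

With the Debian package, the variables are typically supplied to the service through an environment file such as /etc/default/vector; running systemctl cat vector shows which file the unit actually reads.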

Routing: Different Logs to Different Destinations

One source, multiple destinations based on content, using the route transform. Here all_logs stands for any upstream component whose events carry a level field:

transforms:
  router:
    type: route
    inputs:
      - all_logs
    route:
      errors: '.level == "error" || .level == "critical"'
      info: '.level == "info" || .level == "debug"'

sinks:
  errors_to_loki:
    type: loki
    inputs:
      - router.errors
    endpoint: http://localhost:3100
    labels:
      severity: error
    encoding:
      codec: json

  info_to_file:
    type: file
    inputs:
      - router.info
    path: /var/log/vector/info-%Y-%m-%d.log
    encoding:
      codec: text
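
Events that match none of the named routes are emitted on a separate _unmatched output of the route transform. A sketch of a catch-all sink, assuming the router above; the sink ID and file path are hypothetical:

```yaml
sinks:
  unmatched_to_file:
    type: file
    inputs:
      - router._unmatched    # everything that was neither errors nor info
    path: /var/log/vector/unmatched-%Y-%m-%d.log
    encoding:
      codec: json
```

For example, an event with level "warn" would land here.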

Aggregating Metrics From Logs

Vector can derive metrics directly from logs and expose them to Prometheus:

transforms:
  count_errors:
    type: log_to_metric
    inputs:
      - parse_nginx
    metrics:
      - type: counter
        field: status
        name: nginx_requests_total
        tags:
          status: "{{ status }}"
          method: "{{ method }}"

sinks:
  prometheus:
    type: prometheus_exporter
    inputs:
      - count_errors
    address: 0.0.0.0:9598

Prometheus scrapes http://server:9598/metrics and gets nginx_requests_total by status and method — no separate nginx-exporter needed.
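
On the Prometheus side, the exporter is scraped like any other target. A minimal sketch of the matching prometheus.yml entry, where the job name and the host server are placeholders:

```yaml
scrape_configs:
  - job_name: vector            # placeholder job name
    scrape_interval: 15s
    static_configs:
      - targets:
          - server:9598         # host running Vector, port from the sink's address
```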

Debugging: See What Flows Through the Pipeline

Output events to stdout during config development:

sinks:
  debug:
    type: console
    inputs:
      - parse_nginx
    encoding:
      codec: json

Built-in tap: watch the live event stream coming out of a specific component. It talks to Vector's API, which must be enabled in the config (api.enabled: true):

vector tap parse_nginx
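
A sketch of the config block that turns the API on, assuming the default port 8686 and a loopback-only bind:

```yaml
api:
  enabled: true
  address: 127.0.0.1:8686   # keep the API local; tap connects here by default
```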

Comparison With Logstash and Fluentd

Parameter	Vector	Logstash	Fluentd
Language	Rust	Java	Ruby
RAM at startup	~20–50 MB	~500–800 MB	~50–100 MB
CPU at idle	minimal	high	low
Transform language	VRL	Grok + Ruby	Fluent DSL
Performance	very high	medium	high
Metrics from logs	built-in	plugin	plugin

On a VPS with limited resources, Vector is usually the most economical choice.

Quick Reference

Task	Command / config
Install	`sudo apt install vector`
Validate config	`vector validate /etc/vector/vector.yaml`
Watch live stream	`vector tap component_name`
Source — file	`type: file` + `include:`
Source — journald	`type: journald` + `include_units:`
Parse Nginx	`parse_nginx_log!(string!(.message), "combined")`
Filter in VRL	`if condition { abort }`
Routing	`type: route` + named routes
Metrics from logs	`type: log_to_metric`
Prometheus exporter	`type: prometheus_exporter`