First of all, let's talk about Fluentd, an open source data collector for a unified logging layer. It lets you unify data collection and consumption for better use and understanding of data.
It has a large ecosystem of 500+ plugins that allows the community to extend its functionality. It is widely used and reliable, requires minimal resources to run, and structures its unified logging layer as JSON.
Pub/Sub
Pub/Sub is an asynchronous messaging service that decouples services that produce events from services that process events.
It can be used as messaging-oriented middleware, or for event ingestion and delivery in streaming analytics pipelines. Pub/Sub works with topics and subscriptions; each message published to a topic is delivered to every subscription attached to it.
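As a quick illustration of that fan-out behavior, here is a minimal sketch using the gcloud CLI; the names demo-topic, sub-a, and sub-b are hypothetical:
# Create a topic with two subscriptions attached to it
gcloud pubsub topics create demo-topic
gcloud pubsub subscriptions create sub-a --topic=demo-topic
gcloud pubsub subscriptions create sub-b --topic=demo-topic
# Publish a single message...
gcloud pubsub topics publish demo-topic --message="hello"
# ...and each subscription receives its own copy of it
gcloud pubsub subscriptions pull sub-a --auto-ack
gcloud pubsub subscriptions pull sub-b --auto-ack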
By making use of Fluentd and Pub/Sub in GCP, logs can be collected and sent to different stacks such as ELK. This activity has become a critical part of infrastructure administration.
In the Cloud Console, on the project selector page, select or create a Cloud project.
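If you prefer the command line, the same can be done with the gcloud CLI (my-project-id is a placeholder):
# Create a new project and make it the active one
gcloud projects create my-project-id
gcloud config set project my-project-id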
Part of this deployment is the Pub/Sub configuration. In the main bar, type 'Pub/Sub' and create a topic.
A subscription is going to be used for external systems to receive the messages. This could be ELK, Datadog, or any other monitoring tool.
4.1 Type your subscription name and select the topic that was created in the prior step.
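Alternatively, both resources can be created from the CLI, assuming the same [topic-name] and [subscription-name] placeholders used later in this guide:
# Create the topic the forwarder will publish to
gcloud pubsub topics create [topic-name]
# Create the subscription the aggregator will pull from
gcloud pubsub subscriptions create [subscription-name] --topic=[topic-name]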
Two service account keys are going to be needed to publish and subscribe in Pub/Sub.
5.1. In the main bar, select ‘IAM & Admin’ and select ‘Service Accounts’
5.2 Next step, click on ‘Create Service Account’
5.3 Fill in the ‘Service Account’ name and description, click ‘CREATE’:
5.4 Select the role ‘Pub/Sub Publisher’:
5.5 Select the created service account, and click on 'Create Key'
5.6 A JSON key file will be automatically downloaded; rename it to publisher.json. We will use this file in the Fluentd forwarder's configuration.
Repeat the steps to create one more service account for the Pub/Sub subscription, and save the key file as subscriber.json.
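The same two accounts can also be created with the gcloud CLI. A minimal sketch, assuming the hypothetical account names fluentd-publisher and fluentd-subscriber and the [project-id] placeholder:
# Publisher service account, role, and key
gcloud iam service-accounts create fluentd-publisher
gcloud projects add-iam-policy-binding [project-id] \
  --member="serviceAccount:fluentd-publisher@[project-id].iam.gserviceaccount.com" \
  --role="roles/pubsub.publisher"
gcloud iam service-accounts keys create publisher.json \
  --iam-account=fluentd-publisher@[project-id].iam.gserviceaccount.com
# Subscriber service account, role, and key
gcloud iam service-accounts create fluentd-subscriber
gcloud projects add-iam-policy-binding [project-id] \
  --member="serviceAccount:fluentd-subscriber@[project-id].iam.gserviceaccount.com" \
  --role="roles/pubsub.subscriber"
gcloud iam service-accounts keys create subscriber.json \
  --iam-account=fluentd-subscriber@[project-id].iam.gserviceaccount.com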
For this example, we assume that Nginx is running in a Google Cloud virtual machine with Ubuntu Linux.
Run the following commands:
# Install td-agent 4
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent4.sh | sh
# Prepare the development libraries
sudo apt-get install libgdbm-dev libncurses5-dev automake libtool bison libffi-dev
# Install the gcloud Pub/Sub plugin to push the logs to Pub/Sub
sudo /usr/sbin/td-agent-gem install fluent-plugin-gcloud-pubsub-custom
# Replace the existing fluentd config file with a new file
cd /etc/td-agent/
mv td-agent.conf td-agent.conf.old
touch td-agent.conf
The following is the Fluentd configuration file: the source block tails the Nginx logs, and the match block uploads the matching data to Pub/Sub.
<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/nginx-access.pos
  tag example.publish
  format nginx
</source>

<match example.publish>
  @type gcloud_pubsub
  project [project-id]
  key /home/ubuntu/publisher.json
  topic [topic-name]
  autocreate_topic false
  max_messages 1000
  max_total_size 10000000
  flush_interval 1s
  try_flush_interval 0.1
  format json
</match>
Fill in the following information:
[project-id] Project ID created in step 1
[topic-name] Topic created in step 3
And remember to use the publisher.json key created in step 5
# Start the td-agent using the command below
service td-agent start
# Verify the status of the service
systemctl status td-agent.service
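To generate a log entry and confirm the forwarder picks it up, one option (assuming the default td-agent log location) is:
# Hit the local Nginx so a line lands in the access log
curl -s http://localhost/ > /dev/null
# Watch the access log and check the td-agent log for errors
sudo tail -n 5 /var/log/nginx/access.log
sudo tail -n 20 /var/log/td-agent/td-agent.log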
Receiving logs in Pub/Sub
Whenever a request is made to the web server, the logs in /var/log/nginx/access.log are pushed to Pub/Sub, as displayed in the following image.
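You can also peek at the subscription directly from the CLI to confirm the messages are arriving:
# Pull (and acknowledge) a few messages from the subscription
gcloud pubsub subscriptions pull [subscription-name] --limit=5 --auto-ack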
The Fluentd aggregator will collect the logs from Pub/Sub and push them to Elasticsearch.
In addition, logs could be pushed to Google Cloud Storage, although that part of the deployment is not covered in this example.
# Install td-agent 4
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent4.sh | sh
# Install the Google Cloud Pub/Sub plugin to pull the logs from Pub/Sub
sudo /usr/sbin/td-agent-gem install fluent-plugin-gcloud-pubsub-custom
# Install the Elasticsearch plugin to push the logs to Elasticsearch
sudo /usr/sbin/td-agent-gem install fluent-plugin-elasticsearch
# Create a new fluentd config file
cd /etc/td-agent/
mv td-agent.conf td-agent.conf.old
touch td-agent.conf
td-agent.conf content for the aggregator:
<source>
  @type gcloud_pubsub
  tag example.pull
  project [project-id]
  topic [topic-name]
  subscription [subscription-name]
  key /home/ubuntu/subscriber.json
  max_messages 1000
  return_immediately true
  pull_interval 2
  format json
</source>

<match example.pull>
  @type elasticsearch
  include_tag_key true
  host [elastic-search lb ip]
  port 9200
  logstash_format true
  <buffer>
    chunk_limit_size 2M
    flush_thread_count 8
    flush_interval 5s
    retry_max_interval 30
    queue_limit_length 32
    retry_forever false
  </buffer>
</match>
Replace the following values:
[project-id] Project ID created in step 1
[topic-name] Topic created in step 3
[subscription-name] Subscription created in step 4
[elastic-search lb ip] with your load balancer IP
And remember to use the subscriber.json key created in step 6
# Start the agent
service td-agent start
After this step is completed, you should start seeing logs in your Elasticsearch installation.
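A quick way to confirm, assuming Elasticsearch answers on the load balancer address used above: since logstash_format is enabled, the aggregator writes to daily logstash-YYYY.MM.DD indices.
# List the indices; logstash-* entries indicate logs are flowing
curl "http://[elastic-search lb ip]:9200/_cat/indices?v"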
Credits:
Written by : Diego Woitasen
English language corrections: Jesica Greco