.gitignore does not unignore my file!

Sep 12^th, 2022

Today I ran into an interesting edge case that my colleague and I could not easily explain until we checked the git manual. I’m sharing the details because even after using git for 15 years, I was mildly surprised by the behavior. Luckily, it does make sense now that I understand the underlying reasons.

What’s .gitignore?

Anyone using git for version control is likely familiar with the .gitignore file where you can specify which file name patterns git will ignore by default when you perform various actions, such as git status, git diff and git add. It’s very useful to ignore generated files and temporary files. You can add a line that starts with a ! to make it NOT ignore files with that pattern. Patterns are evaluated top to bottom, and the last pattern that matches a file will determine if it’s ignored or not.

What we did

We wanted to add a simple .gitkeep file to preserve the tmp/pids directory in our project on all machines that checked out the project. This arguably trivial operation did not work and took me a while to figure out why it did not work.

# Lets git NOT ignore .gitkeep files, i.e. always check them in:
echo "!.gitkeep" >> .gitignore
touch tmp/pids/.gitkeep
git status
# ... tmp/pids/.gitkeep is NOT listed as new file

Why it did not work

So why is the file we just told git we want to NOT ignore still getting ignored? It’s because our .gitignore already contained this handy line that ignored all temporary files (because they have no place being checked in!)

/tmp/*

This line makes git ignores all files and directories inside tmp. It’s also the direct reason why our earlier change did not give the result we want. After some reading of the manual, I found this relevant bit (emphasis mine):

An optional prefix “!” which negates the pattern; any matching file excluded by a previous pattern will become included again. It is not possible to re-include a file if a parent directory of that file is excluded. Git doesn’t list excluded directories for performance reasons, so any patterns on contained files have no effect, no matter where they are defined.

It really makes sense from a performance perspective to not recurse into children of ignored directories to see if they happen to be not ignored.

TL;DR: you need to re-include an ignored parent directory if you want to customize rules for its contents.

How to make git really NOT ignore your file

Let’s update .gitignore using our new knowledge. The git diff after my changes:

 /tmp/*
+!/tmp/pids
+/tmp/pids/*.pid
+!.gitkeep

The order of operations here is:

Ignore /tmp/* and all its children
Add /tmp/pids back
Ignore all .pid files in tmp/pids (because those are the only ones generated there)
Globally enable .gitkeep

Alternative: use the –force

Alternatively, you can use git add --force tmp/pids/.gitkeep to add it while ignoring .gitignore rules. Arguably that’s faster than checking the manual and thinking about it, but it also prevents you from learning why it failed to work in the first place.

This worked for me ™, so I hope it helps you as well. If you run into issues, feel free to reach out to me via Twitter or email. I’m an experienced Ruby software developer with a focus on back-end systems and an obsession with code quality. I even have a few Ruby on Rails maintenance services that I offer. If you still need to upgrade to Rails 6, then grab the handy free checklist or reach out to have me do it for you.

System tests in rails 6 + rspec + vcr + capybara

May 27^th, 2021

Getting this setup working took me an hour of searching the web to get some of the interactions working, so I hope this helps you do it faster and avoid some pitfalls I stumbled into.

Summary

This post describes some of the obstacles I ran into getting proper JS-supported system tests working in my up-to-date Rails app, Infinity Feed.

It used to be complex to get Capybara working with a JS-enabled headless browser for your integration tests. This has gotten significantly easier with Rails 5.x and got another boost with 6.x.

If your knowledge about Rails frontend testing still stems from the Rails v4 or v5 era, you might be pleasantly surprised by how easy it can be now.

Setting the scene

For this blog post, the relevant bits of my stack are:

capybara 3.35
faraday 1.4
rails 6.1
rspec 3.10
rspec-rails 5.0
turbo-rails 0.5
vcr 6.0
webdrivers 4.6
webmock 3.13

You’ve got the usual suspects here that test Rails with RSpec. I use unit tests for models and other plain old Ruby objects (POROs), there are controller tests for specific interactions and there was a feature test intended to test my frontend interactions. This is done differently in the modern era by using system tests.

I use Faraday for HTTP calls, so VCR + Webmock is part of my test stack to intercept & record HTTP calls so they can be replayed without hitting the network during my tests. This makes tests more consistent and faster. Win-win!

I had a feature test written with out-of-the-box Capybara which appeared to work just fine, until I wanted to test that Hotwire Turbo was doing its magic to make automatic JS-powered HTTP calls to replace my HTML with new HTML. The JS did not get executed in my tests!

So, I had to figure out how to enable JS. Remembering how much of an ordeal this used to be, and how often the best practices changed, I started searching the internet for how it is done in 2021.

System tests have replaced features

Since Rails 5.0/5.1 there are system tests, which are similar to the old feature tests we had, but now Rails handles all of the overhead you used to have to do yourself. You can stop futzing with DatabaseCleaner and configuring Puma to run in-process. It’s all taken care of by Rails now.

Since Rails 6, integration with browsers for testing happens via the webdrivers project, which handles downloading and updating the browser for you. It just works! Just beware of unexpected VCR interactions (see below).

Migrating features to system tests

It’s really easy:

Move feature files from spec/features/ to spec/system/
Change type: :feature to type: :system if needed for these files

Add this:

  before { driven_by :selenium_chrome_headless }

To the top of your system test files to pick a driver that supports JS. You can use this to configure different browsers and screen sizes.

And now you have working system tests!

There’s a few extra things left to configure if you need them.

Devise integration

If you use Devise to handle your authentication, you should register its integration test helpers to be used in feature and system tests:

# spec/rails_helper.rb

RSpec.configure do
  config.include Devise::Test::IntegrationHelpers, type: :feature
  config.include Devise::Test::IntegrationHelpers, type: :system
  # ...
end

This enables the very useful sign_in helper function, so your tests can focus on testing actual features instead of always having to simulate a login.

VCR integration

VCR helps you intercept HTTP calls during tests, as I mentioned earlier. The thing is that Webdrivers automatically checks if the latest version of the browser is installed and downloads it if needed. As you can imagine, this did not play well with VCR.

My solution is to force the webdriver to check for an update before I configure VCR. VCR also needs to be told not to interfere with system test calls.

# spec/rails_helper.rb

# RSpec.configure do
#   ...
# end

# Load this and update the driver before loading VCR.
# If you don't, VCR will intercept the version check.
require 'webdrivers'
Webdrivers::Chromedriver.update

# Configure VCR to don't interfere with system tests
VCR.configure do |config|
  # 127.0.0.1 is for system tests so whitelist it
  config.ignore_hosts '127.0.0.1'

  # My personal settings, feel free to ignore/change
  config.cassette_library_dir = 'spec/fixtures/vcr_cassettes'
  config.hook_into :webmock, :faraday
end

Conclusion

This worked for me ™, so I hope it helps you as well. If you run into issues, feel free to reach out to me via Twitter or email. Besides being close to launching a smart RSS reader over at Infinity Feed, I’m a freelance Ruby software developer with a focus on back-end systems and an obsession with code quality.

Launching is scary

May 11^th, 2021

Until you decide to make your product available to the world, it’s just yours. You get to add to it, remove from it and change it however you want. You can do this without risk of anyone getting upset that you changed or removed their favorite feature. There is nobody to judge your product. Yet.

When you do launch, everything changes. Suddenly it’s not only yours anymore, it’s also theirs. Your users (who preferably turn into your customers) will start to use it. They develop attachments to what is there. They get frustrated with what they don’t like. They will, collectively, present you with conflicting requests to make changes.

The worst outcome is that nobody cares, and all that’s left is you and your disappointment.

Not launching, just yet, in order to fix that one bug, to tweak that one thing… it’s so tempting! You get to keep it for yourself, just for a little while longer.

The problem is that this mindset is similar to analysis paralysis: you can get stuck in it without ever launching. There is always something more you can do. Something to add, something to fix. It will never be perfect, but you can trick yourself into trying to achieve it. In reality, you’re just avoiding the potential for disappointment that could come from an unsuccessful launch.

I’ve read somewhere that if you are not at least a little ashamed of what you have launched, you have waited too long. It always sounded wise and useful, but now that I’m getting closer to launching Infinity Feed, my mind is spinning and grasping at all kinds of excuses for why I should wait just a little longer.

How I’ve been dealing with it? I’ve started talking to folks about what I’m building. I’ve hinted yesterday in the footer of my blog post that it will go live soon-ish. I’ve put up a marketing placeholder page (which read okay at the time, but now I just want to rip it off and replace it with my new one). Basically: I’ve started building momentum that should result in me launching the thing I’ve been announcing.

Now if you’ll excuse me, I have a few more things to do before I hit the “deploy” button for real. Launching really is scary, but this post pushed me another step closer to doing it.

Curious? Want to say something? Feel free to reach out to me via Twitter or email. Besides being close to launching a smart RSS reader over at Infinity Feed, I’m a freelance Ruby software developer with a focus on back-end systems and an obsession with code quality. I’ve also decided to look at cryptocurrencies again, so expect me to mention that again in the future.

Having another look at crypto

May 10^th, 2021

Addition on 8 September 2022: note that I never published a follow-up for this due to how obvious it is that a lot of crypto is a scam, and how ethically dubious a lot of practices are. As a layperson, assume you will get fleeced when you enter the ecosystem. The amount of tinfoil and conspiracy theories in this scene promote Fear/Uncertainty/Doubt (FUD) which aims at getting new people to add money to the crypto ecosystem in order to prop up prices for folks already invested. Sound familiar? Yep, it’s a huge pyramid scheme! See also: Line goes up, a good video that breaks down the ecosystem and issues.

Why now?

Two of my friendly colleagues disappeared into a crypto-shaped black hole over the last three months. They re-emerged, talking with great enthusiasm about it. They have doubled their seed money, and talk about all the potential for the technologies in the space. I must admit, they have broken down some of my long held skepticism, so it’s time to look at it again myself and see if their hype is real.

For context: I’m coming at this as a crypto skeptic software developer. I looked at Bitcoin when it was very new (5+ years ago?) and did not get it. It seemed like speculation on a virtual coin without intrinsic value or purpose. It smelled a lot like a pyramid scheme. Association with black market trade did not help. In the last few years the negative environmental impact of Bitcoin has not made me a fan either.

I’ve looked into speculative trade before. I’ve had a good look at ForEx in 2012-2013, doing paper trading, lots of reading, and some programmatic stuff, before deciding the speculative stuff was not really for me. Similarly I’ve read up on stock market investment, including Benjamin Graham’s classic book.

The crypto world has developed a lot in the last few years. It’s not just Bitcoin anymore, so let’s dive back in to see what’s up with it now.

In case it’s not obvious: I’m not an expert, this is not financial advise, etc. Feel free to point me in the direction of more solid info where I can learn.

Setting goals

It’s good to clarify what your goals are when you start with something. It helps you direct your search for information, and might help a bit against temptations.

My high level goals:

Try to understand how it works
Try to understand what you need to get started
Try to get a basic setup working to interface with the systems (buy/sell coins, use a smart contract, etc)
How do you make money? Look at fees, fundamentals and historic trends/behaviors… is there a minimum amount of money at which you “need” to start to mitigate overhead and fees? How is starting with €100 vs €1000 vs €10000? What trade strategies are there and how do you balance risk vs reward? Are certain coins/tokens better than others? Can stock/ForEx strategies be adapted, index fund strategies? Why (not)?
Actually try to make money if the previous goal resulted in something which gave me enough confidence that it might work. Again, try to come up with a plan that balances risk vs reward (I’m inclined towards spreading risk and draining profits at set percentage increases; ideally I withdraw a fixed percentage of my profits so eventually I’m only working with my profits)
Making stuff: how hard is it to write a smart contract?
Making stuff: how hard is it to build tools that interface with the systems? DAPs (?) / distributed applications are a thing.

The very basics: how does it work?

First goal: I want to understand how things work in crypto in 2021. Below is my understanding so far. It will probably be wrong on a lot of accounts, but that’s why it’s nice to write down so I can correct it later.

Which pieces are there on the market? What types of coins/tokens/contracts/etc are there? My very limited understanding right now is:

Coins. Stable or not. Serious or not. I’ve heard the term “shitcoins” multiple times already, so not all coins have the same esteem.
Tokens, fungible or not (fungible is another word for interchangeable, fiat currency is fungible, because each euro is the same as another euro. A non-fungible token is closer to a signed and numbered limited edition collector’s item, one is not the same as another). What is the difference between a coin and a token?
Smart contracts.
- Programming angle: is there one programming language, or many? How generic or domain specific is this? What are best practices? What are common pitfalls? How does debugging work? How does your toolchain look: is there versioning and source control? How does testing look? Static analysis? Are there code libraries or dependencies?
- Functional angle: how are they used? What other classes of products are stacked on top? I heard about contracts being deployed as neutral agents who just operate protocols defined by the contract (or by multiple contracts), with ownership voided so they truly are running the way they are until the end of time. How have folks dealt with bugs, viruses, etc? How does the infrastructure work that powers this? Who pays for it and how?
DeFi, decentralized finance. Mumble, mumble, “and then you buy your house via DeFi because banks don’t have money so they don’t underwrite mortgages anymore.” Yeah, I’m still a bit puzzled about this. This would involve pretty good real-world legal contracts to defer transfer of ownership to a smart contract, and to have the smart contract somehow enforce that the offline land registry (that’s what we have in the Netherlands at least) is updated to reflect the actual owner of a piece of land.

Where do coins/tokens come from?

I know Bitcoin has miners who burn CPU/GPU/FPGA time to calculate hashes in order to process transactions and create new bitcoins in the process. Transactions are relatively slow and expensive (a Google search led me to this which claims 1MB per 10 minutes of transaction bandwidth, currently you pay 3-10 USD per transaction depending on how much of a hurry you are in (10-60 min delay). That’s relatively little if if your trade is for thousands of dollars, but a lot for a cup of coffee.

Look into how/why Bitcoin has a 4-year periodicity in its prices. There’s something about regularly scheduled restrictions in how many new coins are mined, so total new supply decreases and as such value doubles. This makes huge waves and affects all coins, apparently. My friends keep talking about how the next few months will have huge gains and then it’ll dry up. Why?

How do the other coin types do this? There are Proof of Stake coins (vs Bitcoin’s Proof of Work) where there’s a finite supply of coins which are setup via smart contracts, have a buy-in ahead of time and then freely trade once they go live. There’s a thing about burning coins as part of the protocol so they get more valuable over time. There’s also a thing about staking claims, where you freeze your coins temporarily in exchange for a percentage of the transaction fees? Time periods here are daily or weekly, so it’s all incredibly rapid compared to the yearly rates traditional banks pay.

What are the fundamentals here?

Analogy to the stock market, where each stock represents a piece of ownership of a company. Shares earn dividends for you merely holding on to them. There’s also profit/loss based on the resale value of the share. Companies have revenue, profit, equity, capital reserves, etc. A lot of metrics you can contrast to the share performance.

Based on this you can choose to go long on a share (i.e. you buy it and just hold on to it), avoid it, or do risky things such as shorting (that is, you borrow a share to sell it now and promise to buy it back later; if it works you pocket the profit, if you fail your loss has no hard limit other than the share price). While at it: don’t every borrow money to invest. Only play with what you can afford to lose.

Do things like stop/loss orders exist in crypto? What are the new options that have no analogy in the classic markets?

How does it connect back to the real/offline world?

How can you use it, right now? Better question: how is crypt used right now? How mature are the products? What are common pitfalls, and how have those been mitigated?

How user friendly / approachable is crypto? How close are we to my parents (who are tech late adopters) using this for something useful? How close are we to them being able to explain it to someone else?

Tooling & usability

My friends mentioned that there’s a lot of rough edges and opportunity for quality of life tools. What would be useful? Is there a reason these don’t exist yet? Is there an opportunity to monetize some of this?

Risk management

To manage risk you have to be aware of risks. What are the obvious ones?

Crypto exchanges / brokers can run away with your coins, and apparently that’s just a risk you have to accept. Crypto is dog-eat-dog world, like the game Eve Online. Don’t trade with what you can’t afford to lose. How can you identify trustworthy ones, besides just going with what everyone is doing? If you can’t do this, how can you identify the (obviously) untrustworthy ones?

Wallets, ledgers, accounts. Somewhere to store your coins/tokens/etc. In my mind it’s close to how your SSH keys also control your bank account. Public/private key encryption. Is this model accurate? Be cautious and careful. What are good practices for keeping things safe?

Who are the big fishes in each pond, and what are their interests? When they move, they can generate huge waves. Stop/loss orders which normally might make sense could get triggered hard due to big fish movements. My friends mention that folks trading on margin tend to have their contracts enforced at the end of the day/week/month/quarter. The quarterly ones were unexpected to them due to the volatility they caused. Knowing what/when happens means you can anticipate it.

There is the obvious loss of value, loss of trade volume/liquidity, risk of the huge fishes in the market making splashes, problems with underlying things (i.e. if one coin is connected with others, they have a relation).

Where/how to start?

Writing this down helped me get a picture of what I think I know. Plenty of questions and uncertainty, so next is filling in the gaps and double checking myself, then checking with my friends if my understanding matches theirs.

I’m going to work the high-level goals from top to bottom, I think.

Feel free to reach out to me via Twitter or email. Besides my foray into cryptocurrencies, I’m a freelance Ruby software developer with a focus on back-end systems and an obsession with code quality. I’m also developing a smart RSS reader over at Infinity Feed (beta should go live soon-ish).

Experimenting with recording gameplay on Twitch and Youtube

Jan 20^th, 2015

I have been playing videogames for over 20 years as one of my primary hobbies. During the last few years, I have watched other people play games on Youtube and Twitch.tv. Watching Youtube videos of people playing games has become my primary means of evaluating whether to get a new game or not. Trailer videos and reviews often paint an inaccurate picture of what the game will be like. Watching someone play the game is the most honest way to see what a game really is like playing. Some streamers also happen to be very entertaining to watch.

The experience

This weekend I experimented with streaming my own Diablo III gameplay on my Twitch TV Channel. This was my first time recording and broadcasting. It was an interesting experience.

Streaming a game is definitely an different way to play a game. Having the microphone on and your gameplay recorded feels a little bit like giving a presentation. For me, this meant that I felt very nervous, awkward and self-conscious at first. It made me lose my train of thought a couple of times, etc. After a while it got more comfortable, so the side-effects of being nervous went away a little.

Another thing to get used to is multi-tasking between the game and commentating on what you’re doing.

After streaming with zero watchers, I wanted to watch my recordings to see how I did. Twitch kept pausing the video to buffer every 30 seconds, despite my bandwidth being more than sufficient. I figured it was a limitation of Twitch. Luckily, Twitch makes it easy to export streams to my Youtube channel, which does not have issues with bandwidth.

Tech

For my setup, I started with Nvidia Shadowplay (which comes as part of the video card drivers) to record my gameplay, downscale the footage from my 1440p screen to a 720p video and stream it straight to Twitch TV. I recorded my voice-over with my gaming headset while playing the game.

Shadowplay was a very easy and low-barrier way to get started. The default settings are very conservative, so the result looked horrible to my very spoiled eyes. I had to increase the output resolution and bitrate to make it look a little bit better, but the Twitch stream output was nothing like watching Youtube videos in the same 720p resolution.

Looking for a better way to record and broadcast my video, I followed Twitch’s recommendation of Open Broadcaster Software (OBS). They have a good guide on how to configure it.

The visual quality went up, but I kept pushing too close against the maximum bitrate of 3500 kbit in order to make the stream look as good as possible. In fact, 3500 kbit is not enough to make the video look decent to me. Twitch starts getting buffering issues above 3000 kbit, but looked noticeably worse when I lowered my bitrate that far.

So: I was not happy with the quality of my stream on Twitch, yet my plan was to also get the footage on Youtube for later watching. Having zero subscribers to your (new) streaming channel means nobody sees it anyway. So I decided to skip Twitch and upload straight to Youtube. Uploading directly to Youtube allows me to record higher content at a (much) higher bitrate. Heck, I could upload a 1440p video in a crazy high bandwidth.

To determine what bitrate would be possible, what would be good and what would be a good baseline, I recorded a couple of short clips at different bitrates.

20mbit 1440p video looked fabulous, but my machine started making noticeably more noise by spinning up fans to cool itself. I’ve got quite a powerful setup, which I’ve tuned to make as little noise as needed even during reasonably high load. When it starts to make noise, I know I’m pushing against limits.
10mbit 1440p looked very good. It did not cause noisy fans.
8mbit 1440p still looked very good. In fact, I could not distinguish it from the 10mbit video. This is the bitrate I went with for recording my gameplay.

The Youtube video processing process is also interesting to look at. You upload the video, then they start processing it. Once this is done, the lowest quality version of the video (360p) is available. Invisibly in the background they keep processing increasingly higher quality versions (480p, 720p, 1080p and finally 1440p). Once the video list shows (HD) after the title’s name, the 720p version is available. This background process will take a couple of hours, depending on the length of the video.

I settled on uploading videos at night as unlisted, then making them public the next morning.

Gameplay

Oh, yes. I was recording playing a game.

I was playing Diablo III: Reaper of Souls, which has been out for over a year now. Last week Blizzard released the 2.1.2 patch in preparation for the Season being reset in early February.

At the launch of Season 1, I briefly tried it. At the time there were ways to exploit the game to get a maximum level character in two hours. For me this took the fun out of it. How can you compete on a ladder of sorts if you’re up against cheaters?

Blizzard hotfixed the ways to cheat out reasonably soon, but I had given up on it at that point. Now that a new season is coming up and it looks like people will actually be racing to reach the top, I’m interested in participating again.

So to see what my chances would be, I created a brand new hardcore character in Season 1. Hardcore characters stay dead once they get killed in game. You only live once. I like the added challenge of keeping a character alive.

The first night I played 6 hours and reached level 50 by playing the game in adventure mode. The second night I played about 3 hours to reach level 60. The third night I played another 3 or so hours to reach level 70 (the maximum) and reach my goal. Total time taken: 12.5 hours.

The key is to not push the difficulty level to extreme from the get-go, and to play in Adventure mode instead of the campaign. You have a more focused leveling experience and it’s nice not having to go through the low level parts of the campaign yet again. The gameplay is dumbed down quite a bit because this part serves as a tutorial of sorts. In adventure mode there is no tutorial, because you only unlock it after completing the campaign once.

Boolean Externalities

Sep 19^th, 2014

This is motivated/inspired by Avdi Grimm’s post on Boolean Externalties

http://devblog.avdi.org/2014/09/17/boolean-externalities/

In his post he asks the question: if a predicate returns false, why does it do so? If you chain a lot of predicates, it’s hard to figure out why you get the answer you get.

Consider this example. It implements simple chained predicate logic to determine if the object is scary.

class SimpleBoo
  def scary?
    ghost? || zombie?
  end

  def ghost?
    !alive? && regrets?
  end

  def zombie?
    !alive? && hungry_for_brains?
  end

  def alive?
    false
  end

  def regrets?
    false
  end

  def hungry_for_brains?
    false
  end
end

Following the chain of logic, something is scary if it’s either a ghost or a zombie. They are both not alive, but a ghost has regrets and a zombie is hungry for brains. This is the code as I would probably write it for a production app. It’s simple and very easy to read.

The downside is that if you want to know why something is scary, you have to go and read the code. You can not ask the object why it arrived at its conclusion.

Why

The following is a logical next step in the evolution of the code: I have modified the code so it can explain why a predicate returns true or false, though there is a tremendous “cost” in length and legibility.

class WhyNotBoo
  # The object is scary if there is a reason for it to be scary.
  def scary?
    why_scary.any?
  end

  # Why is this object scary?
  def why_scary
    reasons = []

    # Early termination if this object is *not* scary.
    return reasons unless ghost? || zombie?

    # Recursively determine why this object is scary.
    reasons.concat([:ghost => why_ghost]) if ghost?
    reasons.concat([:zombie => why_zombie]) if zombie?
    reasons
  end

  # For the "why not" question we re-implement the "why" logic in reverse.
  def why_not_scary
    reasons = []
    return reasons if ghost? || zombie?
    reasons.concat([:not_ghost => why_not_ghost]) unless ghost?
    reasons.concat([:not_zombie => why_not_zombie]) unless zombie?
    reasons
  end

  def ghost?
    why_ghost.any?
  end

  def why_ghost
    return [] unless !alive? && regrets?

    [:not_alive, :regrets]
  end

  def why_not_ghost
    reasons = []
    return reasons if ghost?

    reasons << :alive if alive?
    reasons << :no_regrets unless regrets?
    reasons
  end

  def zombie?
    why_zombie.any?
  end

  def why_zombie
    return [] unless !alive? && hungry_for_brains?

    [:not_alive, :hungry_for_brains]
  end

  def why_not_zombie
    reasons = []
    return reasons if zombie?

    reasons << :alive if alive?
    reasons << :not_hungry_for_brains unless hungry_for_brains?
    reasons
  end

  def alive?
    true
  end

  def regrets?
    false
  end

  def hungry_for_brains?
    false
  end
end

Yes, that’s a lot more code. All composite predicates have a “why_[predicate]” and a “why_not_[predicate]” version. Now you can ask if something is scary and why (or why not).

There are a few problems with this approach:

The logic is not in scary?, where you would expect it.
The logic is duplicated between why_scary and why_not_scary. Don’t Repeat Yourself, or you will get logic bugs.
There is a lot more code. A lot of boilerplate code, but also multiple concerns in the same method: bookkeeping and actual logic.

Cleaner code

Let’s see if we can make the code legible again, while preserving the functionality of “why” and “why not”.

class ReasonBoo < EitherAll
  def scary?
    either :ghost, :zombie
  end

  def ghost?
    all :not_alive, :regrets
  end

  def zombie?
    all :not_alive, :hungry_for_brains
  end

  def alive?
    false
  end

  def regrets?
    false
  end

  def hungry_for_brains?
    false
  end
end

So far, so good. The code is very legibile, but there is a mysterious superclass EitherAnd. Before we look at how it works, let’s look at what it allows us to do:

boo = ReasonBoo.new
boo.scary? # => false
boo.why_scary # => []
boo.why_not_scary # => [{:not_ghost=>[:not_regrets]}, {:not_zombie=>[:not_hungry_for_brains]}]

boo.ghost? # => false
boo.why_ghost # => []
boo.why_not_ghost # => [:not_regrets]

boo.zombie? # => false
boo.why_zombie # => []
boo.why_not_zombie # => [:not_hungry_for_brains]

For each predicate that uses either or all we can ask why or why not it’s true and the response is a chain of predicate checks.

How we get cleaner code

If you want to make your code legible, there usually has to be some dirty plumbing code. In this example we have hidden this in a superclass, but it could have been a module as well without too much effort.

In order to keep the code easier to read, I have chosen to not extract duplicate logic into helper methods.

This class implements two methods: either and all.

Both methods have the same structure:

Setup the why_[predicate] and why_not_[predicate] methods.
Evaluate each predicate until we reach a termination condition.
Track which predicates were true/false to explain why we got the result we did.

class EitherAll
  # This method mimics the behavior of "||". These two lines are functionally equivalent:
  #
  #   ghost? || zombie? # => false
  #   either :ghost, :zombie # => false
  #
  # The bonus of `either` is that afterwards you can ask why or why not:
  #
  #   why_not_scary # => [{:not_ghost=>[:not_regrets]}, {:not_zombie=>[:not_hungry_for_brains]}]
  def either(*predicate_names)
    #
    # 1. Setup up the why_ and why_not_ methods
    #

    # Two arrays to track the why and why not reasons.
    why_reasons         = []
    why_not_reasons     = []

    # This is a ruby 2.0 feature that replaces having to regexp parse the `caller` array.
    # Our goal here is to determine the name of the method that called us.
    # In this example it is likely to be the `scary?` method.
    context_method_name = caller_locations(1, 1)[0].label

    # Strip the trailing question mark
    context             = context_method_name.sub(/\?$/, '').to_sym

    # Set instance variables for why and why not for the current context (calling method name).
    # In our example, this is going to be @why_scary and @why_not_scary.
    instance_variable_set("@why_#{context}", why_reasons)
    instance_variable_set("@why_not_#{context}", why_not_reasons)

    # Create reader methods for `why_scary` and `why_not_scary`.
    self.class.class_eval do
      attr_reader :"why_#{context}", :"why_not_#{context}"
    end

    #
    # 2. Evaluate each predicate until one returns true
    #

    predicate_names.each do |predicate_name|
      # Transform the given predicate name into a predicate method name.
      # We check if the predicate needs to be negated, to support not_<predicate>.
      predicate_name_string = predicate_name.to_s
      if predicate_name_string.start_with?('not_')
        negate                = true
        predicate_method_name = "#{predicate_name_string.sub(/^not_/, '')}?"
      else
        negate                = false
        predicate_method_name = "#{predicate_name_string}?"
      end

      # Evaluate the predicate
      if negate
        # Negate the return value of a negated predicate.
        # This simplifies the logic for our success case.
        # `value` is always true if it is what we ask for.
        value = !public_send(predicate_method_name)
      else
        value = public_send(predicate_method_name)
      end

      #
      # 3. Track which predicates were true/false to explain *why* we got the answer we did.
      #

      if value
        # We have a true value, so we found what we are looking for.

        # If possible, follow the chain of reasoning by asking why the predicate is true.
        if respond_to?("why_#{predicate_name}")
          why_reasons << { predicate_name => public_send("why_#{predicate_name}") }
        else
          why_reasons << predicate_name
        end

        # Because value is true, clear the reasons why we would not be.
        # They don't matter anymore.
        why_not_reasons.clear

        # To ensure lazy evaluation, we stop here.
        return true
      else
        # We have a false value, so we continue looking for a true predicate
        if negate
          # Our predicate negated, so we want to use the non-negated version.
          # In our example, if `alive?` were true, we are not a zombie because we are not "not alive".
          # Our check is for :not_alive, so the "why not" reason is :alive.
          negative_predicate_name = predicate_name_string.sub(/^not_/, '').to_sym
        else
          # Our predicate is not negated, so we need to use the negated predicate.
          # In our example, we are not scary because we are not a ghost (or a zombie).
          # Our check is for :scary, so the "why not" reason is :not_ghost.
          negative_predicate_name = "not_#{predicate_name_string}".to_sym
        end

        # If possible, follow the chain of reasoning by asking why the predicate is false.
        if respond_to?("why_#{negative_predicate_name}")
          why_not_reasons << { negative_predicate_name => public_send("why_#{negative_predicate_name}") }
        else
          why_not_reasons << negative_predicate_name
        end
      end
    end
    # We failed because we did not get a true value at all (which would have caused early termination).
    # Clear all positive reasons.
    why_reasons.clear

    # Explicitly return false to match style with the `return true` a few lines earlier.
    return false
  end

  # This method works very similar to `either`, which is defined above.
  # I'm only commenting on the differences here.
  #
  # This method mimics the behavior of "&&". These two lines are functionally equivalent:
  #
  #   !alive? && hungry_for_brains?
  #   all :not_alive, :hungry_for_brains
  def all(*predicate_names)
    context_method_name = caller_locations(1, 1)[0].label
    context             = context_method_name.sub(/\?$/, '').to_sym
    why_reasons         = []
    why_not_reasons     = []
    instance_variable_set("@why_#{context}", why_reasons)
    instance_variable_set("@why_not_#{context}", why_not_reasons)
    self.class.class_eval do
      attr_reader :"why_#{context}", :"why_not_#{context}"
    end

    predicate_names.each do |predicate_name|
      predicate_name_string = predicate_name.to_s
      if predicate_name_string.start_with?('not_')
        negate                = true
        predicate_method_name = "#{predicate_name_string.sub(/^not_/, '')}?"
      else
        negate                = false
        predicate_method_name = "#{predicate_name_string}?"
      end

      if negate
        value = !public_send(predicate_method_name)
      else
        value = public_send(predicate_method_name)
      end

      # The logic is the same as `either` until here. The difference is:
      #
      # * Either looks for the first true to declare success
      # * And looks for the first false to declare failure
      #
      # This means we have to reverse our logic.
      if value
        if respond_to?("why_#{predicate_name}")
          why_reasons << { predicate_name => public_send("why_#{predicate_name}") }
        else
          why_reasons << predicate_name
        end
      else
        if negate
          negative_predicate_name = predicate_name_string.sub(/^not_/, '').to_sym
        else
          negative_predicate_name = "not_#{predicate_name_string}".to_sym
        end

        if respond_to?("why_#{negative_predicate_name}")
          why_not_reasons << { negative_predicate_name => public_send("why_#{negative_predicate_name}") }
        else
          why_not_reasons << negative_predicate_name
        end

        why_reasons.clear
        return false
      end
    end

    why_not_reasons.clear
    return true
  end
end

Conclusion

It is possible to provide traceability for why a boolean returns its value with less than 200 lines of Ruby code and minor changes to your own code.

Despite the obvious edge cases and limitations, it’s nice to know there is a potential solution to the problem of not knowing why a method returns true or false.

New blog and a status update

Jan 10^th, 2012

During Christmas I moved my blog from Blogspot to blog.narnach.com using Octopress.

Blogspot served me well at first, but after a while I got tired of fighting with it. Having to write your posts in raw HTML to make them look decent is so 20th century. Because Octopress is Jekyll with all kinds of nice stuff on top, I can now write my posts in Markdown and be happy again. Deployment is easy as well, since Octopress just generates static HTML and uses scp to put it on my server. I have to admit, it feels good to have complete control over my own blog on my own domain.

The last blog post I imported was written a long time ago, in 2009, while I was still working for yoMedia. Oh boy how things have changed since then.

Since 1 September 2009 I am the proud owner of my one-man company, Narnach. My original plan was to split my time 50/50 between building my own projects and freelancing as a software developer. It took a month for the first freelance opportunity to present itself. While working with that client, another opportunity came along and I’ve been busy non-stop ever since. In all the freelance fun I completely forgot about doing my own projects, though. Funny how that goes.

2010 was a good year. For most of my projects, I worked together with Gerard de Brieder of Govannon as a two-men freelance army and that collaboration worked out well. In 2011, Gerard got himself a physical office that we moved into. Soon, Gokhan Arli of Sylow joined us as third man on the team.

2011 has been my busiest year yet. It went so well that for the first half year I averaged 60 hour weeks just to be able to keep up with all the work. The second half of the year I finally learned to say “NO” to new opportunities, so things quieted down a bit to “regular” 35 hour weeks. Still, I worked way more than I should have and in December I was displaying symptoms of what, according to Wikipedia, may be a Burnout. So I went on vacation on 20 December and I took almost three weeks off to relax and recover. It seems to have helped!

For 2012 I have decided to change two things.

First, I’m slowing down my freelance activities. This means taking on less new clients and letting some existing clients go. The result should be less stress and more energy to work for the clients I am keeping.

Second, I’m revisiting my original plan of developing my own project. Freelancing is fun, but you are always helping other people grow their company. I finally have an idea that I would like to develop, so I figure now is the best time to start. The change from my original plan is that this will strictly be an evening/weekend thing. “They” say it takes at least a year or two to achieve “overnight success”, so I figure I’d not quit my day job as a freelancer just yet.

A possible future for web-based communication

Jun 24^th, 2009

My recent post on the Kings of code side event got too long, so I extracted the following into its own blog post. It is a collection of thoughts on a possible presentation topic.

People like to communicate with each other. Centuries ago we wrote letters and sent them with the merchants, hoping they would arive. The telegraph was a revolution: we could send a message faster, cheaper and with more certainty of delivery. The telephone was even more revolutionary: direct communication over a long distance.

In the age of the internet, change happens even faster. E-mail has been around for “ages”, just like various forms of chat services have come and gone.

Broadcasting has gone through a similar change. It started with spoken announcements; proclamations from the king. At some point there were pamphlets and posters. Books can be seen as a way to broadcast a message. Newspapers are a periodic form of broadcast. Radio enabled long-distance audio broadcasting without the cost of creating a physical carrier for the message. Television added moving images to radio.

Now we have the internet, where we go through similar stages.

Static, web 1.0, websites are pamplets re-invented. Old concepts in an electronic shape. Ebooks are electronic books. Newspapers try to put their content on their own websites, updating them daily to bring their news to the masses. E-mail newsletters just scream “newspaper” to me. Radio can be found as streaming audio.

As we have become more familiar with the internet and with an increased access through broadband, cable and fiber, we have started to innovate with the new medium. RSS changed the direction of broadcast from push to pull. youTube may have started as a way to share existing videos, but it has since grown to a place where anyone can make themselves heard. That is many to many communication.

The internet made interactivity a lot easier than in the off-line world. Web forums and UseNet allow groups of users to interact with each other through written messages. Blogs allow everyone to have their own newspaper column. Instead of an opinions page in the newspaper that is the internet, every writer gets their own column and they all respond and refer to each other’s writing.

Twitter is the latest thing. It is a hybrid between instant messaging, e-mail and RSS feeds. People say it does not scale, yet the Twitter engineers keep making it better and more and more people are able to use it. Is there a limit to which it can scale? Is its centralized server model not going to be an important limitation on both scale and freedom later on? When there are six billion people using one and the same service, relying on it for an important part of their daily communications, how much can you trust on one company to take care of it?

Would it not be better to turn it into a distributed service? It has sucessfully scaled e-mail, Jabber/XMPP, the telephone network and internet itself. E-mail, as a world-wide service, has never gone down, even if individual servers might go down from time to time.

Traditional media has its problems. Paper flyers, mail-delivered advertising and commercials on the radio and television are a few examples. The relatively high price of print media or traditional broadcasting limits the amount of these forms of advertising. This makes it somewhat bearable. In contrast, spam via e-mail, blog comments, web forum posts and instant messages have no such limitations. It’s virtually free to broadcast your message to a million people, so it happens a lot and people really don’t like it.

E-mail spam can happen because there is a near-zero cost or risk for the sender. The same goes for other on-line communication. It can be interesting to consider the friend-of-a-friend model, such as is seen on Linked In and other social networks?

To send a message to someone, the whole connection chain between sender and receiver must be known, no matter how long it is. If someone spams, it means there is a chain of real people connecting them to you. This means the sender is traceable instead of anonymous. If you flag a message as spam, the whole chain is notified that they were part of a spam chain. This means people can choose to ban the spammer from using them as a connection in sending a message. It also means you can identify people who act as a gateway for spammers to send messages to other people.

The nice thing about treating communication as a social network activity is that it makes people more aware that they are dealing with people. By taking anonymity out of the equation, it lowers the tendency of people to act like a Total Fuckwad when they think nobody is watching them.

Are there other ways to look at communication, to turn it upside-down and re-investigate how it works? What is going to be the next Twitter?

A possible future for package management

Jun 22^nd, 2009

My recent post on the Kings of code side event got too long, so I extracted the following into its own blog post. It is a collection of thoughts on a possible presentation topic.

RubyGems has been around for ages and has made it relatively easy to distribute Ruby code. Not everyone uses it, though. Some prefer to use the Debian package manager, or whatever their OS provides, instead. This is very useful if a gem has external dependencies, but it is not as portable as RubyGems.

RIP was recently released (well it is only version 0.0.1, but still) as something to use complementary to RubyGems. It does not allow relative version requirements (<, <=, >=, >) for dependencies, only exact version requirements. It borrows the concept of virtual environments from the Python world. A different approach to package management out in the wild means people will gain new insights. What can we learn here? Where lies the right balance between having rigid, version specific, dependencies and open-ended dependencies?

Thinking along the dependency management line, why do we require exact versions or do we put an upper limit on accepted versions? The only reason I can think of is incompatibilities introduced in later versions, but is it right at all to introduce backward incompatibilities in your API? Can’t we learn something from functional programming here?

In FP, pure functions don’t have side effects. One of the implications is that the data they receive does not get altered. You don’t add a new item to an existing array; you return a new array with the new item appended to the existing array. Because of this, there is no problem when you have a multi-threaded program: there is no risk that two threads will try to modify a shared resource at the same time.

This means you don’t need mutexes to lock an object to one thread while it manipulates the object. No mutexes means no deadlocks or other headaches associated with threading.

What was I talking about? Ah yes, dependencies and how they relate to functional programming. Explicit version dependencies can be seen as mutexes: only one version is allowed to be used at once. Two versions of a library can not be loaded at the same time. This is good if the two versions are incompatible. It is bad if the newer version only adds new functionality to the library.

What if you would build your library in a way that resembles the pure functions of functional programming? No side effects in this case means there are no nasty surprises when upgrading. If your program works with version 1 of the library, it will work without changes with version 1000. Existing functionality is immutable.

To make this work, new versions should only introduce new behaviour, they can not change old behaviour. I think making bugfixes would be ok, but performance enhancements are not, as it might be that you introduce negative side effects in some edge cases, thereby breaking someone’s app. Maybe fixing bugs will actually break someone’s app if they depended on buggy behaviour. Hmmm….

This makes dependency management rather easy. You set a minimum version requirement for the libraries you use and you can just upgrade the libraries to newer versions as they become available. New applications that use new features can co-exist with old applications that use old features from the same library.

Under this model, if you plan to radically re-architect a project, you could fork it and release it under a new name. Rails v1., Rails2 v1, Rails3 v1. A downside is that forks can have a large shared codebase, but there will no longer be conflicts between versions of one project.

Has anyone ever explored the possibilities of library development along these lines? Did it work or were there problems that I have overlooked? What good features of the ‘current’ systems would you lose?

Kings of Code side-event: Amsterdam.rb unconference

Jun 21^st, 2009

The Kings of Code (KoC) conference is going to be held on 30 June 2009. The day before, monday 29 June, is side-event day. Sander van der Vliet, the KoC organizer, posted a message to the Amsterdam.rb mailing list last wednesday to ask if we were interested in organizing a side-event on the 29th.

After only a couple of us replied to Sander’s e-mail, we knew none of the ~~heroes~~ usual people would step forward to organize this. On friday, Julio Javier Cichelli (@monsieur_rock) sent me a direct message on Twitter about my thoughts on how the unconference should be organized. From there, we discussed how to get it organized, how to get speakers and what we were going to present. In short, we stepped forward to organize the side-event.

Next I tweeted to ask for presenters. The message got re-tweeted a number of times and almost within minutes there is feedback from multiple people willing to do a presentation. At the end of the day we have 6 people willing to speak. Using Twitter to organize something is a really quick and powerful way to do it.

As more and more people indicate they are willing to speak (thank you all!) the focus moves from finding speakers to handling the details of making it all work. At what time do we start? How many hours do we have? Is there wifi? Is there a beamer? The further you go, the more you discover there are things you should find out or arrange.

Next week we need to find a sponsor for the venue and we need to start thinking about the things that need to be done on the event day itself. We also need to confirm time and location with all presenters and announce the side-event.

Unconference

The side-event is an unconference. I have never been to one, so I can only go by what is on the internet. A characteristic of unconferences is that there is no fixed agenda. There are no time slots. It is not about one person being an expert and bestowing wisdom upon the attendees, but about the attendees sharing wisdom with each other. I like that.

Everybody knows something other people can benefit from, so the more opportunities there are for everyone to contribute, the more everyone will learn. Any one of the attendees can decide on the spot they want to talk about something, show code or sing a song. I hope people will do this.

The Devnology meetings have impressed upon me the importance of interactivity at a gathering of people, so I hope we can give the unconference an interactive twist.

After a presenter is done speaking, we’ll try to get a group discussion started on the topic. Once the discussion starts to fade, or starts to run in circles, we can ask for the next speaker to get on stage and introduce a next topic.

After the last speaker, we can try to spark group discussions by encouraging people to approach the speakers and ask them questions. This, in turn, can create a number of smaller discussions, with the speakers being the center of interactivity. It’s a great way to get to know new people.

Between 6 more or less confirmed speakers, group discussions, short breaks and (I hope) spontaneous speakers, it looks like we will actually fill up the 5 hours we have available to us.

Finding a theme

Unconferences tend to have a theme, this is so people can prepare themselves and to have some form of coherence between talks. This is trickier, as I did not really think about this until now.

Here is a list of topics that people have expressed they want to talk about:

CouchDB (or an introduction to Erlang)
Communicative Programming with Ruby
Code reviews
Using Rails for Location based search
Short and Sweet II
MacRuby, RESTful web services and other cool things

If I do a bit of creative extrapolating, one topic that can be extracted from this is “The Future of Web Development (using Ruby)”. Let me explain by briefly looking at each topic:

CouchDB is a possible future of databases. It’s not relational, so it has different scaling needs compared to ‘traditional’ relational databases.
RESTful services are the next big thing. Within the Ruby/Rails world it’s becoming a de-facto standard on how to design a web service. The rest of the webdev world seems to be following along here.
If you look at the last decade or two and how the dominant languages have changed, it becomes apparent that code is getting way more readable. Shorter, leaner code is more readable because there is just less code. People use more expressive languages that can do more with less code. Code has become more communicative (at least in Ruby) because of the focus on good conventions like intention revealing naming. DSLs are another good example of readability. If a non-programmer can read your code, you know it is readable.
Alternative Ruby implementations are a way into the future for the language. Diversity allows different ideas to be explored at the same time. The same goes for alternative web frameworks. They are a breeding ground for innovation, which is what you need to get a future that is different from the present.
Code reviews are a way to ensure that code written in the past is actually good enough to be kept around in the future.
Location-based search has a futuristic sound to it, so it fits the theme.

Thoughts on The Future of Web Development (using Ruby)

The Future is an interesting topic that can be applied in a lot of ways. Here’s a number of ideas for presentations:

The future of server administration

With VPSes and cloud computing becoming available everywhere, is there still a need to own your server hardware? With services like Heroku, Github webpages and Disqus, do you still need to even know how to install Ruby or how to configure Apache?

Even if you don’t use these services, by using tools like Capistrano, Ubuntu Machine, Deprec or Rudy, you can still simplify deployment and server management. Does simplifying these things bring new opportunities? What does a sysadmin do with the time saved by these tools and services? Are there new possibilities opened by freeing up sysadmin time? What are they?

The future of package management

I have addressed this in a separate blog post.

The future of web-based communication

I have addressed this in a separate blog post.

My personal experience so far

Between asking Sander for info (via Twitter, of course), discussing things with Julio and discussing details with speakers, there is quite some communication going on. It’s exciting and a little scary at the same time.

I’m a natural introvert, so I tend to avoid communication when I can get away with it. It’s not often that I approach other people first. Taking an active role in helping to organize an unconference like this is therefore quite a bit outside of my comfort zone.

So why the heck am I doing this? One reason is because I want to see it happen. If nobody does it for you, do it yourself. The other reason is is that I want to expand my comfort zone. Someone wise once wrote: “If something does not scare you at least a little bit, it is not worth doing.” It might sound be a bit extreme, but the core idea is valuable nonetheless: a very good way to learn things is to do the things that scare you.

Helping to organize a side-event for KoC will most definitely be a great learning experience.

← Older Blog Archives