Thursday, 9 November 2017

Complex systems, root cause analysis and failure

I just read http://www.michaelnygard.com/blog/2017/11/root-cause-analysis-as-storytelling/ and it reminded me of the classic "How Complex Systems Fail" ( http://web.mit.edu/2.75/resources/random/How%20Complex%20Systems%20Fail.pdf ).

We are building complex systems all the time, and it's actually scary how many defenses against failure are built into them. These defenses can be as simple as checking the return value of a function, or as complex as fallbacks and alternative implementations. They aren't scary because they are there; they are scary because, if even one of them is missing, things go bad pretty quickly.

For now, humans are still better at defending these systems. They come up with workarounds and processes that avoid potential failures. It might be really interesting to apply machine learning to these situations and try to find the sets of actions that lead to failures.

But meanwhile, we have to learn from our systems ourselves, so try to avoid hunting for that one root cause.

Wednesday, 20 September 2017

Amazon CloudFormation and tagging

AWS CloudFormation has several different commands in the aws cli, like "create-stack", "update-stack" and "deploy". Each of these has its good and bad sides. For multiple reasons, we've decided to use "deploy". But then tagging becomes a problem. "create-stack" and "update-stack" both support giving tags, which are then propagated to all supported resources, but "deploy" does not. To make things worse, some CloudFormation resource types do not support tags as properties, but they seem to get their tags from the CloudFormation stack if the stack has them.

So now, after the deploy, we run "aws cloudformation update-stack --stack-name <some> --tags ...". This becomes quite easy with some scripting when you have jq!



As update-stack wants every parameter to be passed with "UsePreviousValue=true", we use some jq to generate the necessary parameters. Then we take the existing Parameters that we use for tagging and generate the tags from them.
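The original script is not shown here, so the following is only a minimal sketch of the two jq transformations described above, assuming the input is the JSON printed by "aws cloudformation describe-stacks". The function names, the tag_stack wrapper and the example parameter names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: function names and the tag_stack wrapper are hypothetical.

# Read describe-stacks JSON on stdin and emit the parameter list for
# update-stack: every existing key, marked UsePreviousValue=true.
previous_parameters() {
  jq -c '[(.Stacks[0].Parameters // [])[]
          | {ParameterKey, UsePreviousValue: true}]'
}

# Read describe-stacks JSON on stdin and turn the selected parameters into
# tags, renaming ParameterKey -> Key and ParameterValue -> Value.
# $1 is a JSON array with the names of the parameters to use as tags.
parameters_to_tags() {
  jq -c --argjson keys "$1" \
    '[(.Stacks[0].Parameters // [])[]
      | select(.ParameterKey as $k | any($keys[]; . == $k))
      | {Key: .ParameterKey, Value: .ParameterValue}]'
}

# Re-apply the current template and parameters, adding only the tags.
# $1 is the stack name, $2 the JSON array of parameter names used as tags.
tag_stack() {
  stack_name=$1
  tag_keys=$2
  description=$(aws cloudformation describe-stacks --stack-name "$stack_name")
  aws cloudformation update-stack \
    --stack-name "$stack_name" \
    --use-previous-template \
    --parameters "$(printf '%s' "$description" | previous_parameters)" \
    --tags "$(printf '%s' "$description" | parameters_to_tags "$tag_keys")"
}
```

With these definitions loaded, something like tag_stack my-stack '["Environment","Owner"]' would run the update, where the stack and parameter names are placeholders. A stack that creates IAM resources would probably also need the matching --capabilities option on update-stack.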

Well, actually "quite easy" is a lie, as I had some trouble figuring out the right jq syntax for replacing a key in a JSON array.

Tuesday, 19 September 2017

Docker, Alpine and dillon's cron: "setpgid: Operation not permitted"

For a while, I've been struggling to get dillon's cron working properly in a Docker container. The problem has been that whenever the ENTRYPOINT was in anything other than shell form, I got 'setpgid: Operation not permitted'.

So, this worked:
ENTRYPOINT /usr/sbin/crond -f
None of these seemed to work:
ENTRYPOINT ["/usr/sbin/crond", "-f"]
Or
ENTRYPOINT ["./entrypoint.sh"]
CMD ["/usr/sbin/crond", "-f"]
Both of them gave:
setpgid: Operation not permitted
Using the shell form has been enough until now. But when I finally needed an entrypoint script to do some preparation work, something had to be done.

"su -c" to the rescue.

ENTRYPOINT ["./entrypoint.sh"]
CMD ["su", "-c", "/usr/sbin/crond -f"]
Seems to be working perfectly.
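
The entrypoint.sh itself is not shown above, so this is only a guess at its general shape: a minimal sketch, assuming it does its preparation and then hands over to whatever CMD supplies. The prepare function and its message are placeholders.

```shell
#!/bin/sh
# Hypothetical entrypoint.sh: the real preparation steps are not in the post.
set -e

prepare() {
  # Placeholder for the actual preparation work.
  echo "entrypoint: preparation done" >&2
}

main() {
  prepare
  # Docker passes the CMD array as arguments to the exec-form ENTRYPOINT,
  # so "$@" is: su -c "/usr/sbin/crond -f"
  exec "$@"
}

main "$@"
```

Because of the exec, the command from CMD replaces the script's shell process instead of running as its child, so in this setup it is su that takes over and then starts crond.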