<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Alan's amusing and surprising homepage]]></title><description><![CDATA[Finding problems, solving problems, writing software.]]></description><link>https://www.franzoni.eu/</link><image><url>https://www.franzoni.eu/favicon.png</url><title>Alan&apos;s amusing and surprising homepage</title><link>https://www.franzoni.eu/</link></image><generator>Ghost 3.11</generator><lastBuildDate>Wed, 30 Dec 2020 02:53:51 GMT</lastBuildDate><atom:link href="https://www.franzoni.eu/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[The curse of the downvote]]></title><description><![CDATA[<p>I don't have strong opinions on Facebook - I'm not even a user anymore - but I think that the "like/dislike" mania is going a bit too far. 
I've read yesterday that an engineer <a href="https://www.cnbc.com/2019/09/19/ex-facebook-engineer-patrick-shyu-makes-fun-of-company-on-youtube.html">was fired from Facebook for having a YouTube channel</a>, but that's beyond the scope of</p>]]></description><link>https://www.franzoni.eu/the-curse-of-the-downvote/</link><guid isPermaLink="false">5d860e2bbb3e470001598a54</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[rants]]></category><category><![CDATA[thoughts]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Sat, 21 Sep 2019 12:07:46 GMT</pubDate><media:content url="https://images.unsplash.com/photo-1551730459-92db2a308d6a?ixlib=rb-1.2.1&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb&amp;w=1080&amp;fit=max&amp;ixid=eyJhcHBfaWQiOjExNzczfQ" medium="image"/><content:encoded><![CDATA[<img src="https://images.unsplash.com/photo-1551730459-92db2a308d6a?ixlib=rb-1.2.1&q=80&fm=jpg&crop=entropy&cs=tinysrgb&w=1080&fit=max&ixid=eyJhcHBfaWQiOjExNzczfQ" alt="The curse of the downvote"><p>I don't have strong opinions on Facebook - I'm not even a user anymore - but I think that the "like/dislike" mania is going a bit too far. I've read yesterday that an engineer <a href="https://www.cnbc.com/2019/09/19/ex-facebook-engineer-patrick-shyu-makes-fun-of-company-on-youtube.html">was fired from Facebook for having a YouTube channel</a>, but that's beyond the scope of my post today. In one of his videos, he says that FB culture is driven by likes - you need to be popular, not to be good.</p><p>Well, I actually think that may be appropriate for Facebook. Eat your own dogfood - isn't that one of the golden rules for just any product?</p><p>But, what I hate nowadays is the curse of the downvote (or the upvote). It's everywhere: on Facebook, on Reddit, even on Hacker News - you get downvoted, then probably your comment is hidden from most people. 
The most upvoted comment (or article) gets more impressions.</p><p>But what does an upvote, or a like, or a share, or a retweet, or a downvote, or a flag, actually mean? I think we've lost the context.</p><p>When I was still on Facebook and some of my friends shared a hoax or just some falsehood, I was usually quick to point that out - with due references. Then they just told me "hey, I shared the article. I'm not the one who wrote that. I may not even agree with that". So... what does "share" mean?</p><p>I think we need clearer meanings for our actions. What's an upvote (or a retweet or a reshare)? Does it mean "I agree"? Does it mean "I have reasons to believe the poster is factually correct"? Does it mean "I like it for no particular reason"? </p><p>What's a downvote? Does it mean "I think the poster is factually incorrect"? Or "I don't like this opinion"? </p><p>What's a flag (in HN) or a report on other sites? Does it mean "The poster suggests something illegal"? Does it mean "I think it's a hoax, fake news, just trolling"? Or does it just mean "I wouldn't like to see it there, throw it away"? What does that REALLY imply?</p><p>Yes, maybe that's complex. "Upvote" or "Downvote" seems easier. <strong>But that's not just our opinion anymore. </strong>Upvotes and downvotes do shape conversations and discussions, and using them unwisely just perpetuates echo chambers and unhealthy ideological silos. </p><p>I think we can - <strong>and we should - </strong>do better. Much better. </p><p>Maybe we should show comments / articles in a random fashion, and let people "grade" them by quality of the opinion and/or correctness. Only after a while, when commenting stops, should we create a "top ten" of comments, or something like that. We should make sure that more than one idea gets exposure, rather than going from the start with a "winner takes it all" mentality. 
It's not good for anybody.</p><p>Photo by <a href="https://unsplash.com/@fbngsk?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Fabian Gieske</a> on <a href="https://unsplash.com/s/photos/agree?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></p>]]></content:encoded></item><item><title><![CDATA[Machine Learning: a sound primer]]></title><description><![CDATA[How to start with machine learning? Some serious, yet practical, suggestions.]]></description><link>https://www.franzoni.eu/machine-learning-a-sound-primer/</link><guid isPermaLink="false">5d414682b35e4300014608a8</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[machinelearning]]></category><category><![CDATA[statistics]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Wed, 31 Jul 2019 08:40:54 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2019/07/lacie-slezak-yHG6llFLjS0-unsplash-1-.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://www.franzoni.eu/content/images/2019/07/lacie-slezak-yHG6llFLjS0-unsplash-1-.jpg" alt="Machine Learning: a sound primer"><p>I see many people who would like to take a glimpse at machine learning, and try to understand a bit how it works. Very often, they can either get pre-baked examples with very specific (and possibly too advanced) approaches - like deep learning - or math-oriented explanations that can be dry or just uninteresting.</p><p>I recently discovered a rather famous free textbook that I hadn't touched before: <a href="http://faculty.marshall.usc.edu/gareth-james/ISL/">An Introduction to Statistical Learning</a> . As you may infer by the non-glamorous title, that's a book that doesn't try to sell you something <em>fancy</em> about machine learning. It's quite a practical and non math-heavy introduction to most useful machine learning topics, which will lead the reader to develop an intuition for what ML methods do. 
SPOILER: neural networks aren't covered! So, if you're just running after the hype, that's not the book for you.</p><p>The only real drawback from the original book is that most examples and demos are coded in R. I don't especially like the language, as it is highly specialized and, most probably, you'll need to know another language beyond it for general-purpose processing.</p><p>So, I'm happy to link a couple of repositories that offer most examples from the book, but coded in Python; those should be more accessible to most people, as the language is very widespread:</p><p><a href="https://github.com/tdpetrou/Machine-Learning-Books-With-Python/tree/master/Introduction%20to%20Statistical%20Learning">https://github.com/tdpetrou/Machine-Learning-Books-With-Python/tree/master/Introduction%20to%20Statistical%20Learning</a></p><p><a href="https://github.com/JWarmenhoven/ISLR-python">https://github.com/JWarmenhoven/ISLR-python</a></p><p>Happy StatLearning!</p><p>EDIT:</p><p>There's a MOOC as well covering most of the topics from the book, by the same original authors: <a href="https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/about">https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/about</a></p><p><em>Photo by <a href="https://unsplash.com/@nbb_photos?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Lacie Slezak</a> on <a href="https://unsplash.com/search/photos/learning?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></em></p>]]></content:encoded></item><item><title><![CDATA[Standalone, single-file, editable Python scripts WITH DEPENDENCIES]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>How badly I wanted something like that?</p>
<h3 id="theproblempythonforscripting">The problem: Python for scripting</h3>
<p>Beside programming and data science, I find Python to be a very useful glue language; I think it's great for shell replacement when bash/zsh scripts get too complex, but there's one caveat: as long as you can</p>]]></description><link>https://www.franzoni.eu/single-file-editable-python-scripts-with-dependencies/</link><guid isPermaLink="false">5a992d9bd483630001e24095</guid><category><![CDATA[Mostly Unixish]]></category><category><![CDATA[python]]></category><category><![CDATA[packaging]]></category><category><![CDATA[delivery]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Tue, 19 Feb 2019 20:58:00 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2019/02/milan-popovic-674483-unsplash-1-.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2019/02/milan-popovic-674483-unsplash-1-.jpg" alt="Standalone, single-file, editable Python scripts WITH DEPENDENCIES"><p>How badly I wanted something like that?</p>
<h3 id="theproblempythonforscripting">The problem: Python for scripting</h3>
<p>Besides programming and data science, I find Python to be a very useful glue language; I think it's a great shell replacement when bash/zsh scripts get too complex, but there's one caveat: as long as you can work with its standard library, you're in the sweet spot. As soon as you'd like to use an external dependency, things get harder: if you don't want to contaminate your system with external dependencies, you'll either a) hope that your system packages a proper version of that library or b) start needing <a href="https://virtualenv.pypa.io/en/latest/">virtualenv</a> and so on.</p>
<p>Both options are fine for manual development, but less so if you want to deliver such scripts to multiple servers to automate some kind of process.</p>
<p>For sure, there are many options to fully package a Python executable - <a href="https://www.pyinstaller.org/">PyInstaller</a> comes to my mind, but others exist. But then you've got a kind of &quot;build process&quot; for your script, and you cannot edit it directly on a server. And I find that, very often, for internal tasks and scripts, my process is exactly that: <strong>I do edit the script on the server, then, when I get it right, I commit it to my version control system and deliver it to other machines.</strong> Yes, I wouldn't do the same for &quot;real&quot; software, but as I said, those are often internal scripts, used for reporting, cron jobs, and other small automated tasks.</p>
<h3 id="thesolutioneditablepythonscriptswithisolateddependencies">The solution: editable python scripts with isolated dependencies</h3>
<p>So what? That's what I baked. Not a perfect solution, but a decent one. Just have <em>python</em> and <em>pip</em> on your system, add a REQUIREMENTS string (equivalent to the content from requirements.txt), then import everything.</p>
<p>This will install the dependencies in a <strong>separate</strong> location inside a temporary directory on first use, then reuse them on later runs.</p>
<p>So: just copy &amp; paste the following snippet, edit the two <code>USER SERVICEABLE</code> sections, then start writing your desired code at the bottom. The snippet here includes an example of how to run <code>requests</code>, so you can just delete the requirements, imports and requests call at the bottom if you don't need that.</p>
<pre><code class="language-python">#!/usr/bin/python3
import os
import sys
from tempfile import gettempdir, NamedTemporaryFile
import hashlib

# USER SERVICEABLE: paste here your requirements.txt
# the recommendation is to create a development virtualenv,
# install the deps with pip inside it, then do a `pip freeze`
# and paste the output here
REQUIREMENTS = &quot;&quot;&quot;
certifi==2018.11.29
chardet==3.0.4
idna==2.8
requests==2.21.0
urllib3==1.24.1
&quot;&quot;&quot;
# USER SERVICEABLE end

def add_custom_site_packages_directory(raise_if_failure=True):
    digest = hashlib.sha256(REQUIREMENTS.encode(&quot;utf8&quot;)).hexdigest()
    dep_root = os.path.join(gettempdir(), &quot;pyallinone_{}&quot;.format(digest))
    os.makedirs(dep_root, exist_ok=True)

    for dirpath, dirnames, filenames in os.walk(dep_root):
        if dirpath.endswith(os.path.sep + &quot;site-packages&quot;):
            # that's our dir!
            sys.path.insert(0, os.path.abspath(dirpath))
            return dep_root

    if raise_if_failure:
        raise ValueError(&quot;could not find our site-packages dir&quot;)

    return dep_root

dep_root = add_custom_site_packages_directory(False)

deps_installed = False

while True:
    try:
        # USER SERVICEABLE: import all your required deps in this block! and keep the break at the end!
        import requests
        # USER SERVICEABLE end

        break
    except ImportError:
        if deps_installed:
            raise ValueError(&quot;Something was broken, could not install dependencies&quot;)
        # pip has no supported in-process API, so run it as a subprocess
        import subprocess

        with NamedTemporaryFile() as req:
            req.write(REQUIREMENTS.encode(&quot;utf-8&quot;))
            req.flush()
            subprocess.check_call([sys.executable, &quot;-m&quot;, &quot;pip&quot;, &quot;install&quot;, &quot;--prefix&quot;, dep_root, &quot;--upgrade&quot;, &quot;--no-cache-dir&quot;, &quot;--no-deps&quot;, &quot;-r&quot;, req.name])

        add_custom_site_packages_directory()
        deps_installed = True

# HERE you can start writing the actual code of your script

r = requests.get(&quot;https://www.google.com&quot;)
print(r.status_code)
</code></pre>
<h3 id="howdoesthiswork">How does this work?</h3>
<ul>
<li>It creates a subdir in the directory where temporary files are held on your filesystem, and downloads the dependencies there using <code>pip</code>. The subdir name is derived from a SHA-256 hash of your requirements string, so if your requirements change, a new directory is used.</li>
<li>Then, it adds that directory's <code>site-packages</code> to your <code>sys.path</code>, allowing Python to find modules and packages there.</li>
<li>When you restart the script, it first tries to find the libraries that were previously downloaded, and only if that fails does it download them again.</li>
</ul>
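<p>To make the first point concrete, here's a minimal sketch of how the directory name follows from the requirements. <code>dep_root_for</code> is a hypothetical helper of mine (the script above computes the same value inline), and the two pins are arbitrary examples:</p>
<pre><code class="language-python">import hashlib
import os
from tempfile import gettempdir


def dep_root_for(requirements):
    # Same naming scheme as the script: sha256 of the requirements text
    digest = hashlib.sha256(requirements.encode('utf8')).hexdigest()
    return os.path.join(gettempdir(), 'pyallinone_{}'.format(digest))


old = dep_root_for('requests==2.21.0\n')
new = dep_root_for('requests==2.22.0\n')
print(old != new)  # a changed pin maps to a different directory
</code></pre>
<p>Since the digest covers the whole string, even a whitespace-only edit of REQUIREMENTS ends up in a fresh directory.</p>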
<h3 id="caveats">CAVEATS:</h3>
<ul>
<li>Of course the target system must have internet access, at least to PyPI, GitHub or another VCS host (depending on your requirements format), and python and pip must be installed.</li>
<li>If your requirements change, nothing deletes the old files in your temp directory. That directory is usually swept at system boot or by cron jobs, so it's not a real problem. BUT: if such a job sweeps only some of the files in the subdir, it could break something (check whether anything like that exists on your system; I remember some older Red Hat/CentOS releases doing that).</li>
<li>If your packages have binary dependencies and/or need to build extensions, you still need the shared libs (for runtime) AND the proper header files / -dev packages. There's no silver bullet for that in this recipe.</li>
</ul>
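<p>If leftover directories bother you, something along these lines could clean them up. It's only a sketch under my own assumptions: both function names are hypothetical, and it treats <em>every</em> other <code>pyallinone_*</code> directory as stale, which is wrong if several scripts with different requirements share the same temp directory.</p>
<pre><code class="language-python">import os
import shutil
from tempfile import gettempdir


def find_stale_dep_roots(current_dep_root, tmp_dir=None):
    # List pyallinone_* directories other than the one currently in use
    tmp_dir = tmp_dir or gettempdir()
    current = os.path.abspath(current_dep_root)
    stale = []
    for name in sorted(os.listdir(tmp_dir)):
        path = os.path.abspath(os.path.join(tmp_dir, name))
        if name.startswith('pyallinone_') and os.path.isdir(path) and path != current:
            stale.append(path)
    return stale


def remove_stale_dep_roots(current_dep_root, tmp_dir=None):
    # Delete every stale directory and return the list of removed paths
    removed = find_stale_dep_roots(current_dep_root, tmp_dir)
    for path in removed:
        shutil.rmtree(path, ignore_errors=True)
    return removed
</code></pre>
<p>You'd pass the <code>dep_root</code> computed by the script as <code>current_dep_root</code>; calling <code>find_stale_dep_roots</code> first lets you review what would be deleted.</p>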
<p>Photo by <a href="https://unsplash.com/@itsmiki5?utm_medium=referral&amp;utm_campaign=photographer-credit&amp;utm_content=creditBadge">Milan Popovic</a> on Unsplash</p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Application authors: please don't force users into your language or packaging details]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>This story has been boiling in my head since long; today I chose to (finally) publish it.</p>
<p>Long story short: <strong>in order to use a certain application, I should not need to understand how to use the language or its packaging ecosystem. Delivery and distribution is a relevant part of</strong></p>]]></description><link>https://www.franzoni.eu/software-authors-please-dont-force-me-into-your-packaging-woes/</link><guid isPermaLink="false">5c374d77a04f9c00019751aa</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[packaging]]></category><category><![CDATA[software]]></category><category><![CDATA[delivery]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Thu, 10 Jan 2019 17:10:00 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2019/01/malcolm-lightbody-1081282-unsplash.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2019/01/malcolm-lightbody-1081282-unsplash.jpg" alt="Application authors: please don't force users into your language or packaging details"><p>This story has been boiling in my head since long; today I chose to (finally) publish it.</p>
<p>Long story short: <strong>in order to use a certain application, I should not need to understand how to use the language or its packaging ecosystem. Delivery and distribution are a relevant part of your app.</strong></p>
<p>This does <strong>not</strong> apply to libraries, frameworks, or tools that are highly contextual for a certain language/environment ecosystem, and would be used <strong>only</strong> by a developer, in any case.</p>
<p>What do I mean? I'll start with an example. I picked it for purely chronological reasons: it's just the latest case I ran into. Here's <a href="https://github.com/Backblaze/B2_Command_Line_Tool">b2</a>, the <strong>command line tool</strong> to get access to Backblaze B2 backup repositories. And these are its installation instructions:</p>
<p><img src="https://www.franzoni.eu/content/images/2019/01/GitHub_-_Backblaze_B2_Command_Line_Tool__The_command-line_tool_that_gives_easy_access_to_all_of_the_capabilities_of_B2_Cloud_Storage.png" alt="Application authors: please don't force users into your language or packaging details"></p>
<ul>
<li>What is pip?</li>
<li>Is such command safe to use?</li>
<li>Will it work in all situations?<br>
<em>answers:</em></li>
<li>Python's package manager. It makes the b2 library available, and such library has the so-called b2 script, which exposes the b2 CLI executable.</li>
<li>possibly safe, but will install b2 within your global/per-user python dependencies, and may alter or install additional packages as dependencies, and such deps may be picked by other, unrelated software in the system;</li>
<li>it may require root to be run correctly.</li>
</ul>
<p>Should every user of Backblaze B2 CLI know how to code in Python and understand how the Python environment works? I think that should <strong>not</strong> be necessary.</p>
<p>b2 is just an example; other famous tools - <a href="https://aws.amazon.com/cli/">awscli</a> and <a href="https://fpm.readthedocs.io/en/latest/installing.html">fpm</a> come to my mind - follow suit.</p>
<p>If the tool you're downloading is written in C or C++, you'll probably expect to find a compiled binary, at least for some OSes and architectures. The same applies to Go, which makes binary creation for multiple platforms exceptionally easy. <strong>You should strive to provide some kind of equivalent piece of software for your users.</strong></p>
<p>But, even for tools written in Python or JavaScript (they seem to be the ones that suffer most from the problem I'm describing), in most situations, the &quot;right thing&quot; to do is to provide a standalone binary, <strong>regardless of how it's written.</strong> You can provide standalone binaries for some archs (this is what <a href="https://github.com/borgbackup/borg">borg</a> does - it's a great backup utility written in Python), or you can rely on native or add-on packaging (additional repositories for Linux, choco for Windows, homebrew for Mac), like <a href="https://httpie.org/">httpie</a>. And, when you create such a binary or package, you should make sure it's totally standalone (i.e. it doesn't alter global system state).</p>
<p>Why? Because leveraging packaging tools that were originally meant for developers of a language is <strong>brittle and risky</strong>, and can be hard to understand. You may end up compromising the system integrity if the wrong dependency slips into globally installed packages - this is especially true for Python, which is often used by many, many system tools. And, if all third-party apps did it that way, we could have continuous breakages. <strong>You can't assume yours is the only application installed in a certain system!</strong></p>
<p>And, if you're writing an app, you can have <strong>full control</strong> of your execution environment and dependencies. You can choose a single Python or NodeJS version, and all of its dependencies. You don't need to support tons of variations!</p>
<p>In some situations, a CLI evolved out of a library. This is the case for <a href="http://pygments.org/">pygments</a>, a very nice syntax highlighting library. It's meant to be used as a library from inside python projects, but it also exposes a widely used <a href="http://pygments.org/docs/cmdline/">pygmentize</a> binary that is employed by many other tools. I'd love to have that CLI tool available as an independent, standalone package!</p>
<p>Final considerations:</p>
<ul>
<li>As a software author, think about whether you're building a library, a framework, or an application. If you're writing an application, think about how you can deliver it to your final users without them needing to understand how you've written it.</li>
<li>If you're a maintainer for a package, think about what you're maintaining. If something is both a library and a command line tool, consider creating multiple (possibly independent) packages.</li>
</ul>
<p>In the meantime:<br>
If you find a Python CLI tool that you'd like to use, and it uses the <em>pip way</em> and you don't want to mess with system dependencies, I recommend <a href="https://github.com/mitsuhiko/pipsi">pipsi</a>. I don't know if something similar exists for Node or other environments.</p>
<p>FAQ:<br>
Q: But I'm a solo open source developer! Packaging this way would steal too much of my time!<br>
A: That's totally fine. Just make sure you don't suggest dangerous things to people. If you're using a developer-only approach, just state that as your target; but the tools I've written about above (b2 and awscli) certainly are of a different class.</p>
<p>Q: Any suggestion for easy packaging?<br>
A: I find <a href="https://brew.sh/">homebrew</a> very nice for Mac packaging and distribution. For Linux, I use <a href="https://github.com/jordansissel/fpm">fpm</a> with <a href="https://www.docker.com/">docker</a> - I've actually created a specific integration project <a href="https://github.com/alanfranz/fpm-within-docker">fpm-within-docker</a>. For Windows... no idea! I know about <a href="https://chocolatey.org/">chocolatey</a> and that's about it.</p>
<p>Photo by Malcolm Lightbody on <a href="https://unsplash.com/photos/401OD83Ke6o">Unsplash</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Misaligned Expectations: investigating the expectations gap]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>As some of my followers already know, I'm enrolled in the great Master's program at <a href="http://www.gatech.edu/">Georgia Tech</a>, the <a href="http://www.omscs.gatech.edu/">OMSCS</a>.</p>
<p>As a part of my studies, I'm doing some research to investigate the expectations gap between the higher education and the industry sectors; why does the university teach students this way?</p>]]></description><link>https://www.franzoni.eu/misaligned-expectations-investigating-the-expectations-gap/</link><guid isPermaLink="false">5b3e88b2ed2a7700017d8984</guid><category><![CDATA[Mostly Unixish]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Thu, 05 Jul 2018 21:12:20 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2018/07/janko-ferlic-174927-unsplash-1-.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2018/07/janko-ferlic-174927-unsplash-1-.jpg" alt="Misaligned Expectations: investigating the expectations gap"><p>As some of my followers already know, I'm enrolled in the great Master's program at <a href="http://www.gatech.edu/">Georgia Tech</a>, the <a href="http://www.omscs.gatech.edu/">OMSCS</a>.</p>
<p>As a part of my studies, I'm doing some research to investigate the expectations gap between the higher education and the industry sectors: why does the university teach students this way? What do students expect? And what do employers and professionals expect?</p>
<p>Help us by answering a small survey, or just subscribe if you'd like to receive the results when the research is completed:</p>
<p><a href="https://www.misalignedtech.com/">https://www.misalignedtech.com/</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[SCP taming: stop local silliness]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>Every <s>day</s> now and then, I get an <strong>scp</strong> command wrong. Scp is designed after commands like <a href="https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.cmds4/rcp.htm">rcp</a> and works totally fine for local-to-local file copy.</p>
<p>While this can (or could) be useful in some contexts, It's not what I like to do these days; very often, if either hosts</p>]]></description><link>https://www.franzoni.eu/scp-taming-stop-local-sillyness/</link><guid isPermaLink="false">5ae1aa970acd070001258def</guid><category><![CDATA[Mostly Unixish]]></category><category><![CDATA[ssh]]></category><category><![CDATA[scp]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Thu, 26 Apr 2018 10:38:49 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>Every <s>day</s> now and then, I get an <strong>scp</strong> command wrong. Scp is designed after commands like <a href="https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.cmds4/rcp.htm">rcp</a> and works totally fine for local-to-local file copy.</p>
<p>While this can (or could) be useful in some contexts, it's not what I like to do these days; very often, if neither host is remote, it's a typo on my part, and results in spurious &quot;<a href="mailto:somebody@host.example.com">somebody@host.example.com</a>&quot; files somewhere in various directories on my disk.</p>
<p>So, what do I do? I put a small script like this in my path <em>before</em> the actual scp executable in order to pre-check arguments and prevent accidental local-only copies.</p>
<pre><code class="language-bash">#!/bin/bash
ORIG=&quot;/usr/bin/scp&quot;

for var in &quot;$@&quot;
do
    if [[ $var = *&quot;:&quot;* ]]; then
        $ORIG &quot;$@&quot;
        exit $?
    fi
done

echo &quot;ERROR: Missing colon. You need to pass at least one remote host specifier in source or dest&quot;
exit 1
</code></pre>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Productivity, the office, and the open floor plan]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>There's one pattern that, nowadays, I find amusing; the <strong>productivity</strong> mantra is repeated everywhere. Everybody wants to get more productive, every company is trying to make their employees more productive. Robotics, AI: everything calls for it.</p>
<p>From Wikipedia:</p>
<blockquote>
<p>Productivity describes various measures of the efficiency of production. A productivity measure</p></blockquote>]]></description><link>https://www.franzoni.eu/productivity-the-office-and-an-open-floor-plan/</link><guid isPermaLink="false">5ad721dd70882b0001974fa6</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[productivity]]></category><category><![CDATA[industry]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Wed, 18 Apr 2018 12:43:37 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2018/04/breather-176228-unsplash-1-.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2018/04/breather-176228-unsplash-1-.jpg" alt="Productivity, the office, and the open floor plan"><p>There's one pattern that, nowadays, I find amusing; the <strong>productivity</strong> mantra is repeated everywhere. Everybody wants to get more productive, every company is trying to make their employees more productive. Robotics, AI: everything calls for it.</p>
<p>From Wikipedia:</p>
<blockquote>
<p>Productivity describes various measures of the efficiency of production. A productivity measure is expressed as the ratio of output to inputs used in a production process, i.e. output per unit of input</p>
</blockquote>
<p>So: <em>productivity is a ratio</em>.</p>
<p><strong>That's a great idea!</strong> Let's do more work in less time, and let's use the rest of our time in other activities - be it study, research, exercise, leisure time, or whatever.</p>
<p>Or, <strong>let's do more work in the same time as before!</strong> That's a good idea as well; we'll achieve more, and maybe, if our company recognizes such additional value, get paid more.</p>
<p>But then. <a href="https://www.inc.com/geoffrey-james/apple-employees-hate-apples-5-billion-open-plan-o.html">Open floor plans</a>. Shared desks.</p>
<p><img src="https://www.franzoni.eu/content/images/2018/04/annie-spratt-604131-unsplash-1-.jpg" alt="Productivity, the office, and the open floor plan"></p>
<p>How can those things go together?</p>
<p>Focusing on software development/tech firms, there seems to be <a href="https://news.ycombinator.com/item?id=16864925">widespread</a> <a href="https://www.dezeen.com/2017/08/10/apple-park-campus-employees-rebel-over-open-plan-offices-architecture-news/">discontent</a> around open floor plans, and yet the number of companies adopting them seems to be growing - I don't have exact data, but it seems to be a new trend. Traditional corporations that want to start behaving like startups start ditching private offices. The usual reason: <em>an open office workspace promotes collaboration</em>. Yes, maybe it does. But <strong>do all the employees in a company just collaborate with each other all the time?</strong></p>
<p>Most probably, they won't. Most probably, they'll need plenty of time to <strong>focus</strong> and do <a href="https://www.amazon.com/Deep-Work-Focused-Success-Distracted/dp/1455586692"><strong>deep work</strong></a>. The risk is to <strong>overoptimize</strong> for a single aim, and forget the whole picture.</p>
<p>More often than not, an open office plan is a marketing synonym for <em>let's cram as many people as we can, quite at random, into the smallest possible space</em>.</p>
<p>Incidentally, I think that <strong>open office workspaces are one of the reasons<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup> that actually make 100% remote work positions effective:</strong> the productivity drop of open workspaces is larger than the drop caused by collaboration/communication difficulties in remote work situations.</p>
<p>Personally, I think that a <strong>good office trumps any other work environment/style in terms of productivity</strong>. How should a good office be organized? I have experienced a variety of work environments and, in my opinion, sort-of open-space rooms work fine, even great, as long as some conditions are met:</p>
<ul>
<li>Don't make them too large. Around 100/150m<sup>2</sup> (1000-1500 sqft) should be ok to accommodate enough people working together.</li>
<li>Reserve enough space for each worker. Usually 7-10m<sup>2</sup> (70-100 sqft) for each person is a good guidance.</li>
<li>Make sure desks are personal, and they're deep and wide enough. No desk should be narrower than 150cm (60 inches), and ideally you should aim at 180/200cm (70-80 inches). <strong>It should be possible for two people to sit at the same desk at any time</strong>, without messing with table legs, other people's legs, or any other item. <strong>Pursue no friction for collaboration.</strong></li>
<li>Provide some place (lockers, cabinets, whatever) to let people put their personal and work belongings in, so the desktop area can be tidy and clear.</li>
<li>Provide good tools. Large screens, powerful workstations, if needed.</li>
<li>Make sure that both <em>visual</em> and <em>auditory</em> noise is minimal. While at their desk, a worker should <em>not</em> be able to see their peers' monitors. Phones should be silenced. People who need to make constant noise (e.g. salespeople, tech support) should not be put in the same room as deep-working people (e.g. software developers). Carpeted floors and doors that can actually be closed can be very useful.</li>
<li>Respect <strong>privacy</strong> and <strong>safety</strong>. People should not fear shoulder surfing or that anybody could approach them without being seen, because that prevents many workers from getting relaxed enough to enter a <em>flow</em> state of mind.</li>
<li>Offices <strong>should not be hallways</strong>. There should be no need for somebody to cross a room just to reach another place.</li>
<li>Enforce a behavioural code. <strong>Talking is totally permitted</strong>, albeit in a low voice and not shouting around the room. I find pair programming 100% fine in such environments.</li>
<li>Provide other community space, with whiteboards, large screens, coffee, snacks, where people can meet &amp; discuss when they need to collaborate rather than focus.</li>
</ul>
<p>Using this kind of office space, you can accommodate 10-15 developers in a room, and change their position when they need to constantly work with somebody else. Collaboration turns out to work out properly, but distractions are limited. A decent trade-off.</p>
<p>What I finally argue is: if you're using an open floor plan just to cram more people in the same space, <strong>you're not improving productivity. You're (possibly) just improving total production</strong>. Because more people working in a bad environment <em>could</em> still produce more than fewer people in a good environment.</p>
<p>But, <strong>take some time to do some calculations</strong>. Your engineers are possibly the most expensive part of your budget. Are you totally sure that you'd want to spend a conspicuous amount of money on their wages, and then let them work at a fraction of their potential efficiency?</p>
<p>Sure, real estate and rents are a cost. But a distracting work environment won't reduce their abilities by just 1% or 2%: I'd speculate that the productivity drop can easily reach 20%-50%. Take note of your &quot;developer density&quot; and when it's approaching the limit, start looking for new office space. Don't wait to be Too Crammed To Do Anything Useful.</p>
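<p>To make that calculation concrete, here's a rough sketch in Python. Every figure in it is a made-up placeholder, not data (the team size, the yearly cost per engineer, the rent per square meter, the extra space); the 20% drop is simply the low end of my speculation above. Plug in your own numbers.</p>
<pre><code class="language-python">def yearly_loss_from_distraction(yearly_cost_per_engineer, engineers, productivity_drop):
    '''Value of the work lost each year to a distracting environment.'''
    return yearly_cost_per_engineer * engineers * productivity_drop


def yearly_extra_rent(extra_m2_per_person, engineers, yearly_rent_per_m2):
    '''Cost of giving every engineer some extra floor space.'''
    return extra_m2_per_person * engineers * yearly_rent_per_m2


if __name__ == '__main__':
    # All figures below are hypothetical placeholders: replace them with yours.
    engineers = 12
    loss = yearly_loss_from_distraction(
        yearly_cost_per_engineer=80000, engineers=engineers, productivity_drop=0.20)
    rent = yearly_extra_rent(
        extra_m2_per_person=4, engineers=engineers, yearly_rent_per_m2=300)
    print('work lost per year:', loss)
    print('extra rent per year:', rent)
</code></pre>
<p>If the first number is much larger than the second one with your own figures, the cheaper office is probably not the cheaper choice.</p>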
<p>Sometimes I hear somebody say that &quot;hey, that's how it works nowadays, adapt or be ejected, I can work with noise and distractions&quot;. Let's suppose that such people actually exist and do a lot of great work; how many of them can you recruit in the current state of talent crunch? And, are you really sacrificing a lot of people that could do great work on the altar of smaller offices? Remember: when creating an office space for a team, you should think about <strong>the whole team</strong>, a lot of different people. Even though it works for you, it may not work for them.</p>
<p>UPDATE July, 20th 2018:<br>
There're more and more people that agree with me about open floor plans, in that they're detrimental to productivity and conversation:</p>
<p><a href="https://theconversation.com/a-new-study-should-be-the-final-nail-for-open-plan-offices-99756">https://theconversation.com/a-new-study-should-be-the-final-nail-for-open-plan-offices-99756</a><br>
<a href="https://code.likeagirl.io/a-research-roundup-to-show-that-your-office-layout-is-toxic-and-some-tips-for-making-it-better-8434864b0ab2">https://code.likeagirl.io/a-research-roundup-to-show-that-your-office-layout-is-toxic-and-some-tips-for-making-it-better-8434864b0ab2</a><br>
<a href="https://m.signalvnoise.com/the-open-plan-office-is-a-terrible-horrible-no-good-very-bad-idea-42bd9cd294e3">https://m.signalvnoise.com/the-open-plan-office-is-a-terrible-horrible-no-good-very-bad-idea-42bd9cd294e3</a><br>
<a href="https://joshuatdean.com/wp-content/uploads/2020/02/NoiseCognitiveFunctionandWorkerProductivity.pdf">https://joshuatdean.com/wp-content/uploads/2020/02/NoiseCognitiveFunctionandWorkerProductivity.pdf</a></p>
<p>Photos by <a href="https://unsplash.com/@breather?utm_medium=referral&amp;utm_campaign=photographer-credit&amp;utm_content=creditBadge">Breather</a> and <a href="https://unsplash.com/@anniespratt">Annie Spratt</a></p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p>The other one being long commutes that waste time and destroy workers' morale. <a href="#fnref1" class="footnote-backref">↩︎</a></p>
</li>
</ol>
</section>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Command line data crunching with Python]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>Every time I'm doing some data crunching on the command line, I find myself juggling between <em>sed, awk, sort, uniq, etc</em>. While I like <em>the UNIX way</em> of having one tool doing one thing well, I sometimes find it slightly boring to put all the tools together, sometimes stretching their</p>]]></description><link>https://www.franzoni.eu/command-line-data-crunching-with-python/</link><guid isPermaLink="false">5a843ff69e2ba90001309b55</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[python]]></category><category><![CDATA[datacrunching]]></category><category><![CDATA[Mostly Unixish]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Wed, 14 Feb 2018 14:44:51 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2018/02/daniel-cheung-129839-1-.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2018/02/daniel-cheung-129839-1-.jpg" alt="Command line data crunching with Python"><p>Every time I'm doing some data crunching on the command line, I find myself juggling between <em>sed, awk, sort, uniq, etc</em>. While I like <em>the UNIX way</em> of having one tool doing one thing well, I sometimes find it slightly boring to put all the tools together, sometimes stretching their features a bit too much.</p>
<p>I know that Perl and Ruby support implicit loops / prints - see <a href="http://www.wellho.net/resources/ex.php4?item=p210/spot">this</a> and <a href="http://www.wellho.net/resources/ex.php4?item=r110/ruby_awk.rb">that</a>. Those switches make it easy to work with data on the command line, but I don't use those languages a lot anymore, so I always need to look something up in online manuals before performing something useful. And I never took the time to learn awk properly, otherwise maybe I wouldn't need all that.</p>
<p>On the contrary, I still use Python quite a lot, and it's becoming the de-facto standard for data science purposes. Using it on the command line by piping something in &amp; out of it, by the way, isn't always so easy - the <code>-c</code> switch allows passing a command in, but it's not always easy to understand whether a char is being interpreted by bash or by the python interpreter, and Python is whitespace-sensitive, too. So a command line like:</p>
<pre><code>$ python -c 'import sys;for x in sys.stdin:print x'
</code></pre>
<p>won't &quot;just work&quot;:</p>
<pre><code>  File &quot;&lt;string&gt;&quot;, line 1
    import sys;for x in sys.stdin:print x
                 ^
SyntaxError: invalid syntax
</code></pre>
<p>But: there's a <a href="https://stackoverflow.com/questions/11966312/how-does-the-leading-dollar-sign-affect-single-quotes-in-bash">bash feature to interpret escape sequences in single-quoted strings</a>, so this will work fine:</p>
<pre><code>$ echo -e &quot;hello\nworld\nthis\nis\nme&quot; | python -c $'import sys\nfor x in sys.stdin:\n    print(x.strip())'
hello
world
this
is
me
</code></pre>
<p>I find Python string manipulation to be great and usually fast-enough for not-so-large datasets, so you can do very interesting things and shell out to standard unix commands only if and when you actually need to. As long as you rely on the standard lib only, you're quite safe about portability, too.</p>
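<p>As an example of what the standard library alone can do, here's a minimal sketch that mimics <code>sort | uniq -c | sort -rn</code>. The function name and the sample lines are just mine, for illustration; it runs on an in-memory list so it can be tried as-is.</p>
<pre><code class="language-python">from collections import Counter


def count_lines(lines):
    '''Count identical lines, roughly what sort | uniq -c does.'''
    return Counter(line.strip() for line in lines)


if __name__ == '__main__':
    # Demo on in-memory lines; pass sys.stdin instead to use it in a pipe.
    sample = ['hello\n', 'world\n', 'hello\n']
    # most_common() sorts by count, highest first
    for value, count in count_lines(sample).most_common():
        print(count, value)
</code></pre>
<p>Since <code>count_lines</code> accepts any iterable of lines, feeding it <code>sys.stdin</code> should be enough to drop it in the middle of a pipeline.</p>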
<p><strong>AN IMPORTANT NOTE:</strong> if you're handling non-ASCII data, I suggest you set the <em>PYTHONIOENCODING</em> variable, especially if you're using Python3, since that interpreter version converts to unicode objects wherever possible:</p>
<pre><code>echo -e &quot;ààà\nworld\nthis\nis\nme&quot; | PYTHONIOENCODING='utf-8' python3 -c $'import sys\nfor x in sys.stdin:\n    print(x.strip())'
ààà
world
this
is
me
</code></pre>
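<p>If you can't control the environment, a possible alternative in Python3 is to decode the bytes yourself. This is only a sketch: <code>utf8_lines</code> is a hypothetical helper of mine, and it assumes the input really is UTF-8.</p>
<pre><code class="language-python">import io


def utf8_lines(byte_stream):
    '''Decode a binary stream as UTF-8 and yield its lines, stripped.'''
    text_stream = io.TextIOWrapper(byte_stream, encoding='utf-8')
    for line in text_stream:
        yield line.strip()


if __name__ == '__main__':
    # Demo on in-memory bytes; in a pipe you would pass sys.stdin.buffer.
    sample = io.BytesIO('ààà\nworld\n'.encode('utf-8'))
    for line in utf8_lines(sample):
        print(len(line), 'characters decoded')
</code></pre>
<p>The output side has the same problem, so setting the variable is still the quickest fix when you just want to print the data back.</p>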
<p>Enjoy your command line! And if you want to become a command line data processing guru, I cannot recommend <a href="https://www.datascienceatthecommandline.com/">this book</a> enough.</p>
<p><em>Photo by Daniel Cheung on <a href="https://unsplash.com/photos/cPF2nlWcMY4">Unsplash</a></em></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Git: automatically set and use multiple commit identities]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>Sure, git is great. Sure it is <a href="http://www.thecodedself.com/Using-Multiple-Author-Identities-With-Git/">possible</a> to use multiple commit identities in git - just set local per-repo variables. If it weren't for the fact that chances of forgetting about it is about 99.9%.</p>
<h4 id="problemcontext">Problem context</h4>
<p>When installing git, it is required to <a href="https://git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup">configure a username and</a></p>]]></description><link>https://www.franzoni.eu/git-identities/</link><guid isPermaLink="false">5a03880317ddee0011bb177d</guid><category><![CDATA[Mostly Unixish]]></category><category><![CDATA[git]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Thu, 12 Oct 2017 13:24:21 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2017/10/andre-hunter-350301-1-.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2017/10/andre-hunter-350301-1-.jpg" alt="Git: automatically set and use multiple commit identities"><p>Sure, git is great. Sure it is <a href="http://www.thecodedself.com/Using-Multiple-Author-Identities-With-Git/">possible</a> to use multiple commit identities in git - just set local per-repo variables. If it weren't for the fact that chances of forgetting about it is about 99.9%.</p>
<h4 id="problemcontext">Problem context</h4>
<p>When installing git, it is required to <a href="https://git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup">configure a username and an email</a> that end up in commit logs:</p>
<pre><code>commit b8e77cefcceccddbc74cb13348ccc51d9128872f
Author: Alan Franzoni &lt;username@franzoni.eu&gt;
Date:   Thu Aug 3 16:35:53 2017 +0200

    Drop fc24, add fc26 support
</code></pre>
<p>I've got a work computer, and a personal one, but I often do some work from my own machine, and sometimes (e.g. during lunch breaks or commuting) I do some personal-related work from my work machine. So, when I clone a personal project on my work machine or vice-versa, then commit anything, <strong>I end up with mixed-up work and personal emails &amp; usernames</strong>.</p>
<p>Sure, nothing tragic, but annoying - especially if CI systems pick up commit logs for sending error mails, and I don't realize I've broken a build until it's too late.</p>
<h4 id="thestupidbuteffectivesolution">The stupid but effective solution</h4>
<p>On my personal workstation, I set my personal info for the global config, then override my git executable this way:</p>
<pre><code class="language-bash">#!/bin/bash
GIT_BIN=&quot;/usr/local/bin/git&quot;
# include the trailing slash, not a leading one for ssh/https compatibility
GITHUB_ORGNAME=&quot;MyMightyOrganization/&quot;
OVERRIDE_USERNAME=&quot;Alan Franzoni&quot;
OVERRIDE_EMAIL=&quot;alan@mymightyorganization.com&quot;
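# pass everything through untouched unless we're cloning a repo of the organization
if [ &quot;$1&quot; != &quot;clone&quot; ] || ! grep --quiet --fixed-strings &quot;${GITHUB_ORGNAME}&quot; &lt;&lt;&lt; &quot;$*&quot; ; then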
	${GIT_BIN} &quot;$@&quot;
	EXIT_CODE=&quot;$?&quot;
	exit ${EXIT_CODE}
fi

shift

${GIT_BIN} clone --config user.name=&quot;${OVERRIDE_USERNAME}&quot; --config user.email=&quot;${OVERRIDE_EMAIL}&quot; &quot;$@&quot;
EXIT_CODE=&quot;$?&quot;
exit ${EXIT_CODE}
</code></pre>
<br>
I leverage the fact that it's possible to set per-repository config values that override global ones, and I hook into the clone command and set those values at clone time, so I don't risk forgetting about it. Since it's a repository property that gets set, it will work in other clients, GUIs or IDEs as well.
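<p>The decision the wrapper takes can also be sketched in Python, which makes it easy to unit-test. This is not what I run, just the same logic restated; <code>build_git_argv</code> is a name I made up, and the constants mirror the ones in the script above.</p>
<pre><code class="language-python">GITHUB_ORGNAME = 'MyMightyOrganization/'
OVERRIDE_USERNAME = 'Alan Franzoni'
OVERRIDE_EMAIL = 'alan@mymightyorganization.com'


def build_git_argv(args, git_bin='/usr/local/bin/git'):
    '''Return the real git command line for the given wrapper arguments.'''
    is_clone = bool(args) and args[0] == 'clone'
    targets_org = any(GITHUB_ORGNAME in arg for arg in args)
    if not (is_clone and targets_org):
        # Anything else is passed through untouched.
        return [git_bin] + list(args)
    # Cloning an organization repo: set the per-repository identity right away.
    return [
        git_bin, 'clone',
        '--config', 'user.name=' + OVERRIDE_USERNAME,
        '--config', 'user.email=' + OVERRIDE_EMAIL,
    ] + list(args[1:])
</code></pre>
<p>A real wrapper built on this would probably end by handing the resulting list to <code>os.execv</code>, so that git's exit code is preserved.</p>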
<p><em>Photo by <a href="https://unsplash.com/@dre0316">Andre Hunter</a></em></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Shell scripting: short or long format options?]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>This is something I get asked quite a lot, so I wanted to write a piece about it.</p>
<p>This is an extract from the manpage from GNU grep:</p>
<pre><code>NAME
       grep, egrep, fgrep, rgrep - print lines matching a pattern

SYNOPSIS
       grep [OPTIONS] PATTERN [FILE...]
       grep [OPTIONS] [-e PATTERN]...  [-f FILE]</code></pre>]]></description><link>https://www.franzoni.eu/shell-scripting-when-to-use-the-long-or-the-shor/</link><guid isPermaLink="false">5a03880317ddee0011bb177b</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[shell]]></category><category><![CDATA[bash]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Thu, 07 Sep 2017 13:23:00 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2017/09/alex-holyoake-213910--1-.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2017/09/alex-holyoake-213910--1-.jpg" alt="Shell scripting: short or long format options?"><p>This is something I get asked quite a lot, so I wanted to write a piece about it.</p>
<p>This is an extract from the manpage from GNU grep:</p>
<pre><code>NAME
       grep, egrep, fgrep, rgrep - print lines matching a pattern

SYNOPSIS
       grep [OPTIONS] PATTERN [FILE...]
       grep [OPTIONS] [-e PATTERN]...  [-f FILE]...  [FILE...]

DESCRIPTION
       grep  searches  the  named  input  FILEs  for lines containing a match to the given PATTERN.  If no files are specified, or if the file “-” is given, grep searches standard input.  By
       default, grep prints the matching lines.

       In addition, the variant programs egrep, fgrep and rgrep are the same as grep -E, grep -F, and grep -r, respectively.  These variants are deprecated, but  are  provided  for  backward
       compatibility.

OPTIONS
   Generic Program Information
       --help Output a usage message and exit.

       -V, --version
              Output the version number of grep and exit.

   Matcher Selection
       -E, --extended-regexp
              Interpret PATTERN as an extended regular expression (ERE, see below).

       -F, --fixed-strings
              Interpret PATTERN as a list of fixed strings (instead of regular expressions), separated by newlines, any of which is to be matched.

       -G, --basic-regexp
              Interpret PATTERN as a basic regular expression (BRE, see below).  This is the default.

       -P, --perl-regexp
              Interpret the pattern as a Perl-compatible regular expression (PCRE).  This is highly experimental and grep -P may warn of unimplemented features.
</code></pre>
<p>So, <code>-F</code> is a short format form, while <code>--fixed-strings</code> is a long format form.</p>
<p>When should I use either form?</p>
<p>My take is: when working interactively, do whatever you like. Whenever you are writing a <strong>script to be reused</strong>, strive to use the <strong>long format options as much as possible</strong>.</p>
<p>Why?</p>
<p>Nobody cares about what you do on your computer. But whenever you write a script that should, later on, be used and read by other people - or just yourself in the future - <strong>the long form is often self-explanatory</strong>. Instead of opening the manpage and looking for cryptic <code>-K -X -i</code>, you can just read the option! It will for sure save more time than what you spend while typing a few more chars.</p>
<p>There're a few exceptions to this rule:</p>
<ul>
<li>if the option is very common. Feel free to use <code>-E</code> and <code>-P</code> for <code>grep</code>, as they're very well known and widespread options for this command.</li>
<li>if you need portability between different OSes (e.g. MacOS+Linux+BSD) and the long format option is not available somewhere; this sometimes happens.</li>
</ul>
<h4 id="whyisthiscool">Why is this cool?</h4>
<p>Consider this command, which is the recommended way to install the great MacOS package manager <a href="https://brew.sh">Homebrew</a>:</p>
<pre><code>/usr/bin/ruby -e &quot;$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)&quot;
</code></pre>
<p>Such a <code>curl</code> command is quite common in many installers; <strong>can you tell what those options do?</strong> Can't you? Consider this line:</p>
<pre><code>/usr/bin/ruby -e &quot;$(curl --fail --silent --show-error --location https://raw.githubusercontent.com/Homebrew/install/master/install)&quot;
</code></pre>
<p>I bet you can now understand what it does; maybe the last flag is a bit vague (it tells <code>curl</code> to follow redirects via <code>Location</code> headers), but the overall feeling is simply... great!</p>
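<p>The translation between the two forms is purely mechanical. Here's a small sketch in Python covering only the four <code>curl</code> flags used above; the table and the <code>expand_short_flags</code> helper are mine, for illustration, and are not part of any tool.</p>
<pre><code class="language-python"># Short-to-long spellings for the four curl flags used above.
CURL_LONG_OPTIONS = {
    'f': '--fail',
    's': '--silent',
    'S': '--show-error',
    'L': '--location',
}


def expand_short_flags(bundle):
    '''Turn a bundle such as -fsSL into the equivalent long options.'''
    if not bundle.startswith('-') or bundle.startswith('--'):
        raise ValueError('not a short option bundle: ' + bundle)
    # Unknown letters raise KeyError on purpose: better than guessing.
    return [CURL_LONG_OPTIONS[letter] for letter in bundle[1:]]


if __name__ == '__main__':
    print(' '.join(expand_short_flags('-fsSL')))
</code></pre>
<p>The readable version costs nothing at runtime; it just moves the lookup from the reader's head (or the manpage) into the script itself.</p>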
<p><em>Photo by Alex Holyoake on Unsplash</em></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Stopping The Internet Of Noise - A Useful Internet Back Again]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>The internet is getting noisy. Too noisy. Having grown up in the nineties, with 56k dial-up, I sometimes struggle to understand how little I'm accomplishing today with all the bandwidth I can leverage.</p>
<p>There were some key factors that made the old internet so productive, by the way, and many</p>]]></description><link>https://www.franzoni.eu/stopping-the-internet-of-noise/</link><guid isPermaLink="false">5a03880317ddee0011bb177a</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[rants]]></category><category><![CDATA[thoughts]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Tue, 04 Jul 2017 21:48:00 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2017/07/jonathan-velasquez-160775--1-.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2017/07/jonathan-velasquez-160775--1-.jpg" alt="Stopping The Internet Of Noise - A Useful Internet Back Again"><p>The internet is getting noisy. Too noisy. Having grown up in the nineties, with 56k dial-up, I sometimes struggle to understand how little I'm accomplishing today with all the bandwidth I can leverage.</p>
<p>There were some key factors that made the old internet so productive, by the way, and many of those factors are just gone.</p>
<p>This is not just a rant. I have some proposals as well.</p>
<h5 id="discussing">Discussing</h5>
<p>Most discussions on the internet once happened in <a href="https://en.wikipedia.org/wiki/Usenet">Usenet</a> newsgroups and mailing lists. You could even access mailing lists via NNTP (the newsgroup protocol) using a great service like <a href="http://gmane.org/">gmane</a>.</p>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/8/87/40tude_Dialog.png" alt="Stopping The Internet Of Noise - A Useful Internet Back Again"></p>
<p>40tude Dialog was my favourite newsreader at the time. I could subscribe to a lot of different newsgroups, <strong>without needing to sign up for each of them</strong>, and it was very easy to catch up with lots of traffic - just a matter of checking which messages still had the title in bold.</p>
<p><strong>There was a single application with a consistent interface</strong> to access the most disparate groups, and <strong>topics were widely enforced</strong> - probably the most common complaint from newsgroup users was that somebody was <em>off topic</em> - such discussions were quickly criticized and got little attention.</p>
<h5 id="chatting">Chatting</h5>
<p>There once was IRC, the Internet Relay Chat. <a href="http://www.mirc.com/get.html">mIRC</a> was the king of clients for a while.</p>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/f/ff/Smuxi-0.7.2-linx-main-window.png" alt="Stopping The Internet Of Noise - A Useful Internet Back Again"></p>
<p>IRC, sometimes, was not great, lacking some advanced features for multimedia and emojis, but it was good. There were plenty of channels with specific topics, and you could join them in a second.</p>
<p>Then, other instant messengers caught on - ICQ, MSN, AOL, etc - but it wasn't a great issue; the protocols they used were mostly simple, and tools like <a href="https://www.trillian.im/">Trillian</a> and <a href="https://pidgin.im/">Pidgin/Adium</a> quickly reverse-engineered them and let the users just pick one client with multiple connections. Skype was a bit of an outlier because its original protocol was harder to reverse-engineer.</p>
<p>Again, <strong>there was an easy way to get a consistent interface in a single application</strong> to chat on IRC, Google Chat, Facebook Chat, generic XMPP, and so on.</p>
<h5 id="gettingnotifiedaboutupdates">Getting notified about updates</h5>
<p>Oh, this was easy, wasn't it? <a href="https://en.wikipedia.org/wiki/RSS">RSS</a> and <a href="https://en.wikipedia.org/wiki/Atom_(standard)">Atom</a> just ruled the world. <a href="https://en.wikipedia.org/wiki/Bloglines">Bloglines</a> and the beloved <a href="https://www.google.com/reader/about/">Google Reader</a> were the first stop, every morning, for many developers.</p>
<p>Some blogger just putting too much shit on their blog? We removed them from our feeds! No matter what!</p>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/b/b7/BazQux_Reader_screenshot.png" alt="Stopping The Internet Of Noise - A Useful Internet Back Again"></p>
<p>We could simply pull <strong>our update sources</strong> within one single application, and check all the updates when we had the time to.</p>
<p>I can remember my workflow: scroll through the sources, open any interesting article in a new tab, mark all feeds as read; then go through the tabs until I finished. Job done. I could, at the same time, read a lot of interesting things and discard a lot of uninteresting shit.</p>
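<p>That workflow is simple enough to sketch with Python's standard library. It's a toy, under some assumptions of mine: the feed is plain RSS 2.0, items are identified by their <code>guid</code> (falling back to the <code>link</code>), and the set of already-seen ids is kept by the caller. Both function names are made up.</p>
<pre><code class="language-python">import xml.etree.ElementTree as ET


def unread_items(rss_text, seen_guids):
    '''Return (guid, title, link) for every feed item not seen before.'''
    channel = ET.fromstring(rss_text).find('channel')
    fresh = []
    for item in channel.findall('item'):
        # Not every feed provides a guid: fall back to the link.
        guid = item.findtext('guid') or item.findtext('link')
        if guid not in seen_guids:
            fresh.append((guid, item.findtext('title'), item.findtext('link')))
    return fresh


def mark_all_read(items, seen_guids):
    '''Remember the given items so they never show up again.'''
    seen_guids.update(guid for guid, _title, _link in items)
</code></pre>
<p>Call <code>unread_items</code>, open what looks interesting, call <code>mark_all_read</code>: that's the whole loop, and it only works because the feed exposes a stable identifier for each entry.</p>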
<h5 id="whathappened">What happened</h5>
<p>If you take a look at what the state of things was, you can notice that:</p>
<ul>
<li><strong>Focus is lost</strong>; once, everything went around <strong>topics</strong>. Channels had topics; newsgroups had topics; blogs had topics (one great rule for a blog was to have a 'razor sharp' focus). <strong>Nowadays we have <em>people</em> instead of topics</strong>. I have nothing against people, but maybe, if I follow a great software architect, I'd like to hear what they've got to say about software, not about other shit.</li>
<li><strong>There are multiple platforms lacking an API</strong>: nowadays I'm forced to switch between Slack, Telegram, Whatsapp, Discord, Skype and whatever IM is going to be the most hip tomorrow. There are hundreds of forums - maybe they retain a topic, but they don't offer a consistent way of pulling content into a single application. Facebook and Twitter - while there used to be some apps that allowed integrating such feeds into a single app, they're long gone as far as I know. StackOverflow mitigates the issue with an acceptable RSS feed, even though it's not as customizable as I'd like.<br>
Many websites just <strong>stopped offering RSS feeds altogether</strong>, or stopped making them customizable, and just push their updates - ALL of their updates - on Facebook and Twitter. I used to be an avid <a href="http://lifehacker.com/">Lifehacker</a> reader, back when they had category-based RSS feeds, but at some point... they just stopped providing that service.</li>
<li><strong>Can't mark things as read</strong>. Have you ever noticed that? <a href="http://www.lifehack.org/articles/lifehack/ultimate-way-inbox-zero.html">Inbox Zero</a> once was a good practice for every workflow. <strong>Facebook and Twitter simply don't allow this functionality</strong>. You cannot mark a tweet or post as read, it could resurface at any time. And you cannot see which posts or tweets you've not read yet - probably because the amount would be simply overwhelming. <strong>You need to waste your time on such websites</strong>.</li>
</ul>
<h5 id="wherewewanttogo">Where we want to go</h5>
<p>We want to reverse all this. We need:</p>
<ul>
<li><strong>Topics</strong>. Google Plus created something similar to that with Collections (without RSS, of course); or <strong>we could just create a blog or username for each of our topics</strong> - I think most of us won't discuss so many totally unrelated fields. It's a change of mentality - we shouldn't write something just because we can. Unless we are celebrities, people, especially strangers, won't follow us just for the sake of it - <strong>we need actual, quality content</strong>. Small talk is fine on FB or Twitter.</li>
<li><strong>APIs</strong>. I'm not saying we should get back to IRC or to NNTP. <strong>But we need a common API for Instant Messaging and forum-like software</strong>, so that people can use their favourite tools to organize their data sources. Installing tens of apps or visiting tens of websites every day is not an option.</li>
</ul>
<p>Incidentally, this is not a call for the open internet; I could not care less if there's a leading provider for content, <strong>as long as such content is accessible in a standard way</strong>.</p>
<p>This is a call for <strong>a useful internet back again</strong>.</p>
<p>(Incidentally, this is not the usual focus of this blog, which is mostly technical. But I hope my audience will forgive me, this is something important, IMHO).</p>
<p>EDIT:<br>
an interesting discussion arose on <a href="https://news.ycombinator.com/item?id=14698545">Hacker News</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Students that don't "get" computer science - bimodality as a teaching failure]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>There seems to be a common belief about computer science: people either <em>get it</em>, or <em>don't get it</em>.</p>
<p>A recent paper by the University of Toronto, <a href="http://www.cs.toronto.edu/dcs/documents/p113-patitsas.pdf">Evidence That Computer Science Grades Are Not Bimodal</a>, dispels such myth.</p>
<p>The paper even suggests some explanations for this myth: <strong>teaching failure</strong>. We don't</p>]]></description><link>https://www.franzoni.eu/students-that-dont-get-computer-science-bimodality/</link><guid isPermaLink="false">5a03880317ddee0011bb1778</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[teach]]></category><category><![CDATA[cs]]></category><category><![CDATA[industry]]></category><category><![CDATA[academy]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Fri, 17 Mar 2017 08:29:00 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2017/03/gareth-harper-128701--1-.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2017/03/gareth-harper-128701--1-.jpg" alt="Students that don't "get" computer science - bimodality as a teaching failure"><p>There seems to be a common belief about computer science: people either <em>get it</em>, or <em>don't get it</em>.</p>
<p>A recent paper by the University of Toronto, <a href="http://www.cs.toronto.edu/dcs/documents/p113-patitsas.pdf">Evidence That Computer Science Grades Are Not Bimodal</a>, dispels this myth.</p>
<p>The paper even suggests an explanation for this myth: <strong>teaching failure</strong>. We don't know how to teach computer science; so, it seems that some people succeed at learning <em>despite teachers' inability to convey CS concepts</em>.</p>
<p>I think the idea is interesting and very appropriate; I have some opinions of mine:</p>
<ul>
<li>We should make people <em>enjoy</em> computer science! A lot of students get their grades and <strong>they can't write ten lines of code</strong>! But coding is actually the most interesting and amusing part of computer science, where you can actually <strong>do things</strong> and see how they work. <strong>Let's put more tinkering into CS courses!</strong></li>
<li>We should put some effort into making CS <em>cool</em>. As long as people regard CS as boring, they won't take that journey.</li>
</ul>
<p>Otherwise, we risk an even further split between <em>practitioners</em>, those that <em>learn by doing</em> and that sometimes don't have a formal education, and the <em>academics</em>, that live in the walled garden of universities and have little to no contact with the industry.</p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Primitive types are not your friends]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>Really. Stop (ab)using just strings and integers!</p>
<p>Yes, they're the building blocks of whatever you do. But if you abuse them, you're not taking full advantage of your Object Oriented programming language.</p>
<p>Let me explain with some concrete example. For the sake of conciseness, I'll use Python here, but</p>]]></description><link>https://www.franzoni.eu/primitive-types-are-not-your-friends/</link><guid isPermaLink="false">5a03880317ddee0011bb1775</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[programming]]></category><category><![CDATA[type system]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Fri, 17 Feb 2017 22:45:00 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2017/02/matt-briney-169042--2-.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2017/02/matt-briney-169042--2-.jpg" alt="Primitive types are not your friends"><p>Really. Stop (ab)using just strings and integers!</p>
<p>Yes, they're the building blocks of whatever you do. But if you abuse them, you're not taking full advantage of your Object Oriented programming language.</p>
<p>Let me explain with a concrete example. For the sake of conciseness, I'll use Python here, but the concepts fit most OO languages. <strong>Classes are your friends.</strong></p>
<p>Really. Classes and type systems weren't invented to make you feel miserable and add bloat to your code. They were invented to <strong>help</strong> you write correct code, letting the machine verify you're doing things properly and consistently.</p>
<p>But if you don't use them, the machine can't help you. And no, dynamically (yet strongly) typed languages aren't a good reason to skip correctness and proper OO design.</p>
<p>Let's start with some examples:</p>
<pre><code class="language-python">def retrieve_web_resource(http_url):
    #[...]
    requests.get(...)
</code></pre>
<p>What is that <em>http_url</em>? In many contexts, it will probably be a string. <strong>But this is not a best practice</strong>. Primitive types like strings carry <strong>very little semantics</strong> and <strong>do not enforce domain boundaries</strong>. What does that mean?</p>
<p>It means that somewhere in your code you have something like</p>
<pre><code class="language-python">my_http_url = somewhere.create_url_from()
</code></pre>
<p>(with <em>my_http_url</em> still being of type <strong>str</strong>) and then you carry it around as a str. Maybe, at some point, you <strong>manipulate</strong> it. And, possibly because of some error or because some unexpected data was thrown in, at some point it <strong>ceases to be a valid http url</strong>.</p>
<p>This might mean that, in many parts of your code, whenever you use such a value, you may be tempted to <strong>check some preconditions:</strong></p>
<pre><code class="language-python">def retrieve_web_resource(http_url):
    assert http_url.startswith(&quot;http&quot;), &quot;not an http url&quot;
</code></pre>
<p>But then, <strong>you need to check those preconditions everywhere!</strong></p>
<p>And if you pass your variable around and at some point pick a less-than-meaningful name for it, it may be hard to understand what that variable was supposed to contain, whether you're debugging or dealing with a serious error that totally messed up its content.</p>
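<p>A minimal sketch of how this goes wrong, assuming a hypothetical <code>looks_like_http_url</code> helper that stands in for the scattered precondition (it is not part of any library): a plain string happily accepts a manipulation that changes what the URL means, and the prefix check never notices.</p>
<pre><code class="language-python">def looks_like_http_url(value):
    # Hypothetical stand-in for the precondition repeated around the code base.
    return value.startswith('http')


base = 'http://example.com'
# Naive manipulation: the missing slash glues the file name to the host.
broken = base + 'index.html'

print(broken)
print(looks_like_http_url(broken))
</code></pre>
<p>The check passes even though the string no longer points at the resource we meant, which is exactly the kind of mistake a str cannot protect you from.</p>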
<p>So, what is a much better solution? Create the right type, <strong>always</strong>, and use those types in your application. Example:</p>
<pre><code class="language-python">class HTTPURL(object):
    def __init__(self, scheme, host, path, query=None):
        if scheme != &quot;http&quot;:
            raise ValueError(&quot;not an http url&quot;)
        # validate host
        # validate path
        # validate query - should be a param-value dict
        self._host = host
        self._path = path
        self._query = query if query is not None else {}

    @classmethod
    def parse_from_str(cls, http_url_as_str):
        # parse the url into scheme, host, path and query, then
        return cls(scheme, host, path, query)

    def add_query_param(self, param, value):
        # add param and value to dictionary, return a new
        # HTTPURL object with that additional value. Raise
        # an error if such value is already defined
        raise NotImplementedError

    def append_path(self, additional_path):
        # properly append additional_path, checking for
        # missing or duplicate initial slashes, etc, then
        # return a new HTTPURL object.
        raise NotImplementedError

    def __str__(self):
        # create the url from current variables, and return it as
        # a str
        raise NotImplementedError

</code></pre>
<p>(this is simplified - you might want to add auth data, port, etc)</p>
<p>See the point? You're making sure that <strong>no invalid HTTPURL instance can exist</strong>, and you're providing <strong>proper methods to manipulate such instances</strong>. This way you're encapsulating your domain knowledge about URLs in a specific class instead of scattering and duplicating it everywhere, and you're making it really hard for users to make accidental mistakes: those would be caught quickly.</p>
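<p>To make the idea concrete, here is one possible way to fill in part of that outline. It is only a sketch under a few assumptions: it targets Python 3, it relies on the standard library's <code>urllib.parse</code> for the parsing, it ignores ports, auth data and any query string in the parsed input, and the validation rules are my own simplification.</p>
<pre><code class="language-python">from urllib.parse import urlencode, urlsplit


class HTTPURL(object):
    def __init__(self, scheme, host, path, query=None):
        if scheme != 'http':
            raise ValueError('not an http url')
        if not host:
            raise ValueError('missing host')
        self._host = host
        # Normalize the path so it always starts with exactly one slash.
        self._path = '/' + path.lstrip('/')
        self._query = dict(query) if query is not None else {}

    @classmethod
    def parse_from_str(cls, http_url_as_str):
        parts = urlsplit(http_url_as_str)
        # Simplification: port, auth data and query string are dropped here.
        return cls(parts.scheme, parts.hostname, parts.path)

    def add_query_param(self, param, value):
        if param in self._query:
            raise ValueError('param already defined: ' + param)
        new_query = dict(self._query)
        new_query[param] = value
        # Return a new object; the current one is left untouched.
        return HTTPURL('http', self._host, self._path, new_query)

    def __str__(self):
        url = 'http://' + self._host + self._path
        if self._query:
            url += '?' + urlencode(self._query)
        return url


url = HTTPURL.parse_from_str('http://example.com/search')
print(url.add_query_param('q', 'python'))

try:
    HTTPURL.parse_from_str('ftp://example.com/file')
except ValueError as error:
    print(error)
</code></pre>
<p>The first print shows <code>http://example.com/search?q=python</code>, while the ftp url is rejected at construction time with <code>not an http url</code>: the check lives in one place, and everything that receives an HTTPURL can rely on it.</p>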
<p>The HTTPURL we've seen here is composite and this approach may make a lot of sense, but in Python you can even use <a href="http://www.markphelps.me/2014/12/09/tiny-types.html">Tiny Types</a> in a very, VERY efficient fashion, when you just need to add some semantics and some validation without a full-fledged class because you just don't need that kind of manipulation:</p>
<pre><code class="language-python">class Angle(int):
    def __new__(cls, v):
        if isinstance(v, Angle):
            return v
        # check the type first, so that the range check only ever sees ints
        # (Python 3 has no separate long type)
        if not isinstance(v, int):
            raise TypeError(&quot;invalid angle type, must be int&quot;)
        if v &lt; 0 or v &gt; 360:
            raise ValueError(&quot;invalid angle, must be between 0 and 360&quot;)
        return super(Angle, cls).__new__(cls, v)
</code></pre>
<p>See? The instances you create are of type <strong>Angle</strong>, but they're still an <strong>int</strong>! So you can <strong>pass them straight</strong> to other APIs/libraries that don't support, or know nothing about, your custom types, and yet you get <strong>semantics and validation</strong>!</p>
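<p>A quick way to convince yourself, sketched for Python 3 (so only <code>int</code> is checked; the range test is written with <code>range</code> purely for brevity): the standard library knows nothing about Angle, yet it accepts instances wherever an int is expected.</p>
<pre><code class="language-python">import math


class Angle(int):
    def __new__(cls, v):
        if isinstance(v, Angle):
            return v
        if not isinstance(v, int):
            raise TypeError('invalid angle type, must be int')
        # Membership in a range is a compact way to test 0 to 360 inclusive.
        if v not in range(0, 361):
            raise ValueError('invalid angle, must be between 0 and 360')
        return super(Angle, cls).__new__(cls, v)


right = Angle(90)
print(isinstance(right, int))
print(math.radians(right) == math.radians(90))
print(list(range(Angle(3))))
</code></pre>
<p>This prints <code>True</code>, <code>True</code> and <code>[0, 1, 2]</code>, while something like <code>Angle(400)</code> fails right away with a ValueError.</p>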
<p>Of course, if you need more operations you'll need to redefine more methods:</p>
<pre><code class="language-python">class Angle(int):
    def __new__(cls, v):
        if isinstance(v, Angle):
            return v
        # check the type first, so that the range check only ever sees ints
        # (Python 3 has no separate long type)
        if not isinstance(v, int):
            raise TypeError(&quot;invalid angle type, must be int&quot;)
        if v &lt; 0 or v &gt; 360:
            raise ValueError(&quot;invalid angle, must be between 0 and 360&quot;)
        return super(Angle, cls).__new__(cls, v)

    def __add__(self, other):
        # the sum is validated too: going above 360 raises ValueError
        return Angle(int(self) + int(Angle(other)))

angle = Angle(10)
angle2 = Angle(2)
angle3 = Angle(angle2)

print(angle + angle2)
print(type(angle + angle2))
</code></pre>
<pre><code>12
&lt;class '__main__.Angle'&gt;
</code></pre>
<p>Do you think all that is too verbose, and you'd just prefer to stick to strings and ints? Well... I suggest you try this approach, check how many bugs you prevent and/or how your code becomes more expressive and cohesive, and then think again about verbosity. <strong>If your code is more compact at the expense of correctness and automated checking, maybe that compactness isn't worth it.</strong></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Enabling Process vs Bureaucratic Process]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>The Process, today, is king. There're a lot of people focusing just on process, and a lot of them hold it as The True Solution to all the world's problems.</p>
<p>Then, there're those who don't believe in process. They say, <strong><a href="http://agilemanifesto.org/">we mostly focus on people!</a> Process is a failed attempt</strong></p>]]></description><link>https://www.franzoni.eu/enabling-process-vs-bureaucratic-process/</link><guid isPermaLink="false">5a03880317ddee0011bb1774</guid><category><![CDATA[Ollivander]]></category><category><![CDATA[process]]></category><category><![CDATA[bureaucracy]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Thu, 17 Nov 2016 22:06:00 GMT</pubDate><media:content url="https://www.franzoni.eu/content/images/2016/11/photo-1416339684178-3a239570f315.jpeg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://www.franzoni.eu/content/images/2016/11/photo-1416339684178-3a239570f315.jpeg" alt="Enabling Process vs Bureaucratic Process"><p>The Process, today, is king. There're a lot of people focusing just on process, and a lot of them hold it as The True Solution to all the world's problems.</p>
<p>Then, there're those who don't believe in process. They say, <strong><a href="http://agilemanifesto.org/">we mostly focus on people!</a> Process is a failed attempt at commoditizing individuals</strong>; a lot of large organizations survive <em>despite</em> their processes, not because of them!</p>
<p>But then... you start noticing something. When you speak with agile folks, they talk about <strong>guidelines</strong>. You talk about Google, which is the poster child of the New Enterprise, where people are happy and aren't burdened by strange processes, and you discover that, when designing <a href="http://shop.oreilly.com/product/0636920041528.do">Site Reliability Engineering</a> practices, <strong>checklists</strong> are considered an excellent tool.</p>
<p>So?</p>
<p>So, I think that <strong>process is useful</strong>. Sometimes, it's <strong>overly so</strong>. Yet, <strong>it must be the right kind of process</strong>.</p>
<p>Let's see a couple of examples of a process to request some days off.</p>
<pre><code>Example A

In order to request one or more days off, you should send an email stating which days you'd like not to work to both your direct supervisor and the HR department. The request should be sent at least 10 business days before the first day off you'd like to take. Your direct supervisor has got five business days to either confirm or reject your request; if he doesn't answer, your request is automatically approved.

Another process clearly defines who's your supervisor and who takes his place should he be on vacation or ill.
</code></pre>
<pre><code>Example B

In order to request one or more days off, you should download the proper *Ask For Some Days Off Form*, fill it in completely, sign it, and send it to your supervisor via email.
</code></pre>
<h3 id="enterthebureaucracy">Enter the bureaucracy</h3>
<p>I'm confident that most corporate environments follow Example B rather than Example A. Example B is the perfect bureaucratic process: <strong>while it is short and it doesn't seem overly complicated at first glance, it is nebulous, and it may enable nothing</strong>.</p>
<p>Where should you download the right form from? Most probably it won't be linked from the policy page. You'll need to find it <em>somewhere</em>. And, if you end up asking your colleagues for the template, make sure it's the latest, approved version. Then you fill it in. How? Maybe it's a PDF, and you need to print it, write the required days in a box, sign it, scan it back and send it via email.</p>
<p>And then, what happens? <strong>Maybe, nothing</strong>. Maybe your supervisor is ill, and won't read emails. Maybe he isn't, but he won't check immediately whether there's somebody who can do your job while you're away, and then he'll forget about it. <strong>There's nothing in such a process which guarantees an outcome</strong>. There's no explicit responsibility transfer - your supervisor won't risk anything if he doesn't answer your request; you're the only one who's impacted; maybe you won't know anything until the day before you'd like to leave!</p>
<h3 id="agoodenablingprocess">A good, enabling process</h3>
<p>A good process <strong>will make it clear why it exists</strong>; we should not create processes for their own sake. A good process establishes a clear guideline to reach a target. And <strong>we should not create processes before we need them</strong>! A special request which is made once every couple of years by a single employee can be handled manually.</p>
<p>A good process is <strong>easy to follow and is not unnecessarily complicated</strong>; it won't use forms, software or other specialized tools when simple ones will fit perfectly.</p>
<p>You <strong>always know who's in charge at a certain phase of a good process</strong>; it's impossible (or at least highly unlikely) for the process to get stuck without somebody taking care of it. Inaction is prevented by a <em>default action</em> which allows the process to go on without somebody's consent, if he's unwilling to enter the loop.</p>
<p>A good process <strong>either produces the desired outcome, or fails as fast as possible</strong>.</p>
<h3 id="anevenbetterprocess">An even better process</h3>
<p>An even better process is the one which makes it clear <strong>which business rules are applied at each approval step</strong>; using again the example above, this could be:</p>
<pre><code>For each team, at least two members are required to be in the office at all times, in order to handle the basic support levels needed by customers; to meet this requirement and cope with illness or other emergencies, no days off will be allowed if that would reduce the number of working team members to a number lower than three.

In order to request one or more days off, you should send an email stating which days you'd like not to work to both your direct supervisor and the HR department. The request should be sent at least 10 business days before the first day you'd like off. Your direct supervisor has got five business days to either confirm or reject your request; if he doesn't answer, your request is automatically approved.

Another process clearly defines who's your supervisor and who takes his place should he be on vacation or ill.
</code></pre>
<p>This process is probably close to great; people in the team will probably self-organize their days off, and won't request things that will inevitably be denied, reducing the HR department's and the supervisor's work.</p>
<p>Let's make sure the <a href="https://www.amazon.com/Adrenaline-Junkies-Template-Zombies-Understanding/dp/0932633676">Template Zombies</a> stay dead, this time!</p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[badblocks: test your mass storage!]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>Are you unsure whether your mass storage works properly? Do you suspect it's faulty somewhere?</p>
<p>Verify it with a nifty tool, included in most Linux distros and in macOS if you install the ext2fs tools:</p>
<pre><code>sudo badblocks -swv -t random -b 4096 -c 2048 &lt;DEVICE&gt;
</code></pre>
<p>It will perform</p>]]></description><link>https://www.franzoni.eu/badblocks-test-your-mass-storage-2/</link><guid isPermaLink="false">5a03880317ddee0011bb1772</guid><category><![CDATA[Mostly Unixish]]></category><category><![CDATA[tips]]></category><category><![CDATA[bash]]></category><dc:creator><![CDATA[Alan Franzoni]]></dc:creator><pubDate>Mon, 26 Sep 2016 10:29:58 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>Are you unsure whether your mass storage works properly? Do you suspect it's faulty somewhere?</p>
<p>Verify it with a nifty tool, included in most Linux distros and in macOS if you install the ext2fs tools:</p>
<pre><code>sudo badblocks -swv -t random -b 4096 -c 2048 &lt;DEVICE&gt;
</code></pre>
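<p>It will perform a destructive read-write test of your mass storage device and report any bad blocks it finds: <code>-w</code> selects the write-mode test, <code>-t random</code> uses a random test pattern, <code>-s</code> and <code>-v</code> show progress and verbose output, <code>-b 4096</code> sets the block size in bytes, and <code>-c 2048</code> is the number of blocks tested at a time.</p>
<p><strong>WARNING:</strong> everything will be deleted during the test! Make sure you've already backed up your data!</p>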
<!--kg-card-end: markdown-->]]></content:encoded></item></channel></rss>