
Beyond the Resume: Speculative Hiring Trends in an AI World

I was recently at a conference where I started chatting with a computer science graduate about job hunting. We discussed the job market as it exists today, with all of its economic influences, disruptive technology (cough, AI, cough), and competition.


When I jumped into my first professional job out of college, the world was an entirely different place. Social media was still an infant, the dot-com bubble was fresh in everyone's mind, and we were a few years away from (real) smartphones becoming available. In this conversation, I reflected on the challenges I faced then and wondered how I would react to the demands and problems of the current technology climate.

Having been on both sides of the interview process, as an interviewee and as an interviewer building one of the best teams I have worked with, I thought it might be good to share the conversation I had and expand on it a little further, having had more time to think about it. Let’s dive into it…

How LinkedIn Connects Candidates to Employers

LinkedIn has become the go-to platform for recruiters seeking top talent, but it’s evolving beyond a simple job board. The platform has adjusted its algorithm to prioritize actively engaged candidates – those who post updates, comment on other people’s posts, and interact with their network. Simply having a profile and being logged in isn’t enough anymore. Recruiters want to connect with individuals who use the platform daily, so that when an opportunity finds its way into their inbox they actually respond, and who demonstrate interest and expertise in their field through that activity.

Look at this from the recruiter’s and LinkedIn’s point of view. If you are paying buckets of cash for LinkedIn Hire to find a candidate for an open position, you, as a recruiter, ideally want a response to each message sent. Also, LinkedIn doesn’t want to connect individuals to a recruiter who might not respond. This is the summary of the interaction right here… full stop. This shift means job seekers need to rethink their approach.

Be Active On Social Media

Staying visible requires active participation, from sharing industry insights to engaging with thought leaders. Those who embrace this shift can significantly increase their chances of being noticed and approached for opportunities. In contrast, passive candidates who only update their profiles when job hunting may find themselves overlooked. Being “active” on any job platform (especially LinkedIn) usually means you will reply to an inquiry for an open job.

AI Will Kill the Resume

The rise of AI tools has transformed the job application process, making it easier than ever for candidates to create tailored resumes that align perfectly with job descriptions. Tools like ChatGPT can generate highly customized resumes that match job listings with striking accuracy. I haven’t done this myself because I am very selective about the positions I seek. Still, if you are more open to different types of work, I am guessing a prompt that mashes your resume and the job description together might put you at the top of the list.


However, this has created a significant challenge for recruiters… many candidates look great on paper but lack the actual skills needed for the job. This trend has led to an increase in candidates getting through the initial screening, only to falter during technical interviews or practical assessments. I see a lot of chatter on subreddits where it’s been very difficult to land a job, let alone get a call back after the first interview. As AI-driven resume generation becomes more common, companies will need to adopt new strategies to verify a candidate’s true abilities before moving forward in the hiring process.

As someone who has helped build teams, I can tell you hiring people is VERY time consuming. The time spent on the interview process is time spent away from my actual day-to-day tasks; unfortunately, that work doesn’t stop just because I am interviewing candidates. Even back then, I was very selective about the individuals who got an email for an interview.

How Do You Prove Competence?

With AI making it easier to “embellish” resumes, the challenge for employers is determining whether a candidate truly possesses the skills they claim. Just as students can use AI to complete homework without fully understanding the material, job seekers can list expertise they don’t genuinely have or may just have passing knowledge in. This presents a costly dilemma for businesses… how do they identify qualified individuals without wasting resources on lengthy interview processes?


Organizations are adopting different screening mechanisms, such as skill assessments, project-based evaluations, and real-world problem-solving tests. Conducting multiple rounds of interviews can be expensive and inefficient, so refining the process to quickly filter out candidates that may not be a good match is crucial to maintaining productivity and hiring success. I think we are in the middle of this shift right now.

Having said that, I hope this isn’t a new era of “Interview 2.0” questions because, you know… all software engineers need to be able to tell you how to get 4 gallons of water using only a 3 and 5-gallon jug, or to estimate the number of trees in Central Park. Although, I would rather do that than complete a week-long programming assignment to prove I know how to program. Trust me, I have declined many of those, as if I have an infinite amount of time in my day and love doing work for free.

Public Speaking and Open Source May Hold the Answer

So what do we do about this particular problem?

To address the challenge of validating skills without extensive in-person interviews, companies/interviewers may want to turn to alternative proof of competencies, such as public speaking engagements and, for the tech world, their open-source contributions. Reviewing a candidate’s GitHub activity, technical blog posts, or recorded presentations can provide valuable insights into their expertise and problem-solving abilities.

GitHub Contributions

Although I never really looked at user content 7-10 years ago, I did look at GitHub and open source contributions on other platforms. With AI being able to generate code in any language these days, there is something to be said about supporting a product or an open source project. When a user/customer reports an issue, the project maintainer must triage, root cause the problem, and interact with another human being. This speaks volumes.


Similarly, public speaking appearances at industry events or webinars, or videos posted on platforms like LinkedIn and YouTube, allow recruiters to see how well candidates can articulate complex concepts. At the end of a presentation, there is inevitably a Q&A session where they aren’t going to be able to use ChatGPT to answer a question live and in person. These in-person examples or recorded sessions provide a more authentic measure of skill and commitment than a polished resume ever could.

The Full Stop Thought

So, where do we go from here? We are seeing some of these changes happen in recruiting today. I have heard of interviews where a link kicks off a recorded session, and you, as the interviewee, are presented with questions to answer on video for review later. I don’t know how effective this is, but I have heard of it happening. Is this a good solution? It sounds horrible if you ask me, but change is happening.

As someone who has been on both sides of the fence, the challenges in hiring today are interesting and unique, to say the least. However, there is something to be said for verifiable contributions, like GitHub activity or posted videos. As someone who thinks social media has done a number on society and who keeps socials for work-related purposes only, I came to one possible answer… this content can provide a window into someone’s vested interest in the topics they choose and how they demonstrate understanding of those topics.

Until next time!

Enabling External Storage on Mesos Frameworks

There has been a huge push to take containers to the next level by twisting them to do much more. So much more, in fact, that many are starting to use them in ways that were never originally intended, even going against the founding principles of containers. The most noteworthy principle being left on the cutting room floor is, without a doubt, being “stateless”. It is pretty evident that this trend is only accelerating… a simple search of Docker Hub for popular traditional databases yields results like MySQL, MariaDB, Postgres, and OracleLinux (Oracle suggests you might try running an Oracle instance in a Docker container. LAF!). Then there are all the NoSQL implementations like Elastic Search, Cassandra, MongoDB, and CouchBase, just to name a few. We are going to take a look at how we can bring these workloads into the next evolution of stateful containers, using the Mesos Elastic Search Framework as a proposed model.


The problem with stateful containers today is that pretty much every container implementation, whether it’s Docker, Apache Mesos, or something else, has been architected with those original principles, such as being stateless, in mind. When the container goes away, anything associated with it is gone. This obviously makes it difficult to maintain configuration or keep long-term data around. Why make this tradeoff to begin with? It keeps the design simple. The problem is that useful applications all have state somewhere. As a response, container implementations enabled the ability to store data on the local disks of compute nodes, thus tying workloads to a single node. However, on the failure of a particular node, you could potentially lose all data on its direct-attached storage. Not good for high availability, but at least there was some answer to this need.

Enter the Elastic Search Mesos Framework

This brings me to a recently submitted Pull Request to the Elastic Search (ES) Mesos Framework project on GitHub to add support for external storage orchestration and, more importantly, to enable management of those external storage resources among Mesos slave/agent nodes. Before I jump into the ES Framework, I should quickly talk about Mesos Frameworks in general. A Mesos Framework is a way to specialize a particular workload or application. This specialization can come in the form of tuning the application to best utilize hardware on a given slave node, like a GPU for heavy graphics processing, or distributing scaled-out tasks across different racks within a datacenter for high availability. A Framework consists of two components: a Scheduler and an Executor. When a resource offer is passed along to a Scheduler, the Scheduler can evaluate the offer, apply its unique view or knowledge of its particular workload, and deploy specialized tasks or applications in the form of Executors to slave/agent nodes (seen below).

Mesos Architecture - Picture Thanks to DigitalOcean

The ES Framework behaves in the same way described above. The ES Scheduler and Executor have been designed in such a way that both components are implemented as Docker containers. The ES Scheduler is deployed to Marathon via Docker, and by default the Scheduler will create 3 Elastic Search nodes based on a special list of criteria. If an offer meets that criteria, an Elastic Search Executor in the form of a Docker container will be created on the Mesos slave/agent node representing that offer. That Executor image holds the task, which in this case is an Elastic Search node.

Deep Dive: How the External Storage Works

Let’s do a deep dive on the Pull Request and discuss why I made some of the decisions that I did. I first broke apart OfferStrategy.java into a base class containing everything common to a “normal” Elastic Search strategy and one that makes use of external storage. OfferStrategyNormal.java retains the original functionality and behavior of the ES Scheduler, which is on by default. I then created OfferStrategyExternalStorage.java, which removes all checks for storage requirements. Since the storage used in this mode is all managed externally, the Scheduler does not need to take storage requirements into account when it evaluates the criteria for deployment.
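To make the split concrete, here is a hypothetical sketch of the class hierarchy (the class names mirror the real files, but the criteria, method names, and thresholds here are illustrative, not the actual implementation):

```java
// Illustrative sketch of splitting offer evaluation into a common base
// plus two storage strategies. Thresholds and method names are made up.
abstract class OfferStrategyBase {
    // Criteria common to both modes (CPU, memory, etc.)
    boolean isAcceptable(double cpus, double memMb, double diskMb) {
        return cpus >= 1.0 && memMb >= 256 && storageRequirementMet(diskMb);
    }

    // Only the storage check differs between the two modes.
    abstract boolean storageRequirementMet(double diskMb);
}

class OfferStrategyNormal extends OfferStrategyBase {
    // Local-disk mode: the offer must carry enough disk for the ES node.
    boolean storageRequirementMet(double diskMb) {
        return diskMb >= 1024;
    }
}

class OfferStrategyExternalStorage extends OfferStrategyBase {
    // External-volume mode: storage is managed outside Mesos, so the
    // Scheduler ignores the offer's disk resource entirely.
    boolean storageRequirementMet(double diskMb) {
        return true;
    }
}
```

With this shape, an offer with no disk resource at all is rejected by the normal strategy but accepted by the external-storage one, which is exactly the behavioral change the Pull Request introduces.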

The critical piece in assigning external volumes to Elastic Search nodes is being able to uniquely associate each node with the set of volumes containing its configuration and data, represented by /tmp/config and /data. That means we need to create, at minimum, runtime-unique IDs. What do I mean by runtime unique? It means that if there exists an ES node with an identifier of ID2, no other node has ID2. If an ID is freed, let’s say ID2 from a Mesos slave/agent node failure, we make every attempt to reuse that ID2. This identifier is defined as a task environment variable, as seen in ExecutorEnvironmentalVariables.java.

// Record the node ID as a task environment variable
addToList(ELASTICSEARCH_NODE_ID, Long.toString(lNodeId));
LOGGER.debug("Elastic Node ID: " + lNodeId);

private void addToList(String key, String value) {
    envList.add(getEnvProto(key, value));
}

Why an environment variable? Because when the task, and therefore the Executor, is lost, the reference to the ES Node ID is freed, so when a new ES node is created to replace the failed one, the node ID is recycled. How do we determine which node ID to use when selecting a new or recycled ID? We do this using the following function in ClusterState.java:

public long getElasticNodeId() {
    List<TaskInfo> taskList = getTaskList();

    // Create a bitmask of all the node IDs currently being used
    long bitmask = 0;
    for (TaskInfo info : taskList) {
        LOGGER.debug("getElasticNodeId - Task:");
        LOGGER.debug(info.toString());
        for (Variable var : info.getExecutor().getCommand().getEnvironment().getVariablesList()) {
            if (var.getName().equalsIgnoreCase(ExecutorEnvironmentalVariables.ELASTICSEARCH_NODE_ID)) {
                bitmask |= 1L << (Integer.parseInt(var.getValue()) - 1);
                break;
            }
        }
    }
    LOGGER.debug("Bitmask: " + bitmask);

    // Then find the lowest node ID not being used
    long lNodeId = 0;
    for (int i = 0; i < 31; i++) {
        if ((bitmask & (1L << i)) == 0) {
            lNodeId = i + 1;
            LOGGER.debug("Found Free: " + lNodeId);
            break;
        }
    }

    return lNodeId;
}

We get the currently running task list, find out which tasks have the environment variable set, build a bitmask, then walk the bitmask starting from the least significant bit until we find a free ID. Fairly simple. Since I don’t run Elastic Search in production, it was pointed out to me that this only supports 32 nodes, so there is a future commit that will make this generic for an unlimited number of nodes.
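For what it’s worth, one way that future commit could lift the 32-node ceiling is java.util.BitSet, which grows as needed. This is a hypothetical sketch of the same lowest-free-ID logic, not the actual commit; NodeIdAllocator and nextFreeNodeId are names I made up for illustration:

```java
import java.util.BitSet;
import java.util.List;

// Hypothetical generalization of getElasticNodeId() with no fixed width.
// The IDs in use would come from the running tasks' environment
// variables, exactly as in the original code.
class NodeIdAllocator {
    static long nextFreeNodeId(List<Long> idsInUse) {
        BitSet used = new BitSet();
        for (long id : idsInUse) {
            used.set((int) (id - 1)); // mark ID as taken (IDs start at 1)
        }
        // nextClearBit scans from the least significant bit, so freed IDs
        // (e.g. from a failed slave/agent node) are recycled first.
        return used.nextClearBit(0) + 1;
    }
}
```

For example, with nodes 1, 2, and 4 alive, the next node created would recycle ID 3 rather than claim ID 5, preserving the reuse behavior described above.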

Let’s Do This Thing

To run this, you need a Mesos configuration running 0.25.0 (the version supported by the ES Framework) with at least 3 slave/agent nodes. You need to have your Docker Volume Driver installed, like REX-Ray for example, and you need to pre-create the volumes you plan on using, named with the --frameworkName parameter (default: elasticsearch) appended with the node ID and config/data (example: elasticsearch1config and elasticsearch1data). Then start the ES Scheduler with the command line parameter --externalVolumeDriver=rexray, or whatever volume driver you happen to be using. You are all set! Pretty easy, huh? Interested in seeing more? You can find a demo on YouTube located below.
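As a sanity check on that volume-naming convention, here is a tiny hypothetical helper; the class and method names are mine, not part of the Framework, but the resulting names match what the Scheduler expects you to pre-create:

```java
// Builds the pre-created volume names the ES Scheduler looks for:
// frameworkName + nodeId + "config" / "data",
// e.g. "elasticsearch" + node 1 -> elasticsearch1config, elasticsearch1data
class VolumeNames {
    static String configVolume(String frameworkName, int nodeId) {
        return frameworkName + nodeId + "config";
    }

    static String dataVolume(String frameworkName, int nodeId) {
        return frameworkName + nodeId + "data";
    }
}
```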

BONUS! The Elastic Search Framework has a facility (although only recommended for very advanced users) for using the Elastic Search JAR directly on the Mesos slave/agent node and in that case, code was also added in this Pull Request to use the mesos-module-dvdi, which is a Mesos Isolator, to create and mount your external volumes. You just need to install mesos-module-dvdi and DVDCLI.

The good news is that the Pull Request has been accepted and is currently slated for the 8.0 release of the Elastic Search Framework. The bad news is that the next release looks to be version 7.2, so you are going to have to wait a little longer for an official release with this external volume support. HOWEVER, if you are interested in test driving the functionality, I have the Elastic Search Docker images used for the YouTube video up on my Docker Hub page. If you want to kick the tires first hand, you can visit https://hub.docker.com/r/dvonthenen/ for images and instructions on how to get up and running. Both the Scheduler and Executor images were auto-created as a result of the Gradle build done for the demo.

Frameworks.NEXT

What’s up next? This was a good exercise in adding on to an existing Mesos Scheduler and Executor, and the {code} team may have a Framework of its own on the way. Stay tuned!

Drops the Mic