<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[TippyBits]]></title><description><![CDATA[A portal to knowledge about everything from open source data platforms to generative AI. ]]></description><link>https://tippybits.com/</link><image><url>https://tippybits.com/favicon.png</url><title>TippyBits</title><link>https://tippybits.com/</link></image><generator>Ghost 5.82</generator><lastBuildDate>Sat, 18 Apr 2026 23:51:53 GMT</lastBuildDate><atom:link href="https://tippybits.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[How to get a Software Engineering role? Don&#x2019;t.]]></title><description><![CDATA[How to find a software engineering role after college? Don&#x2019;t. Software engineering is a broad topic. Find a niche to specialize in!]]></description><link>https://tippybits.com/how-to-find-a-software-engineering-role/</link><guid isPermaLink="false">6829f5c37ef3590001cbc0ca</guid><category><![CDATA[Software Engineering]]></category><category><![CDATA[Computer Science Degree]]></category><category><![CDATA[Graduating College]]></category><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Mon, 30 Jun 2025 14:30:00 GMT</pubDate><media:content url="https://tippybits.com/content/images/2025/06/nasa-Yj1M5riCKk4-unsplash-1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2025/06/nasa-Yj1M5riCKk4-unsplash-1.jpg" alt="How to get a Software Engineering role? Don&#x2019;t."><p>You graduated with your Computer Science degree and are wondering, &#x201C;How do I get a software engineering job?&#x201D; Don&#x2019;t&#x2026; I was talking with someone a few weeks ago at an Elastic DevCon event. 
She&#x2019;s a recent grad and was sharing how challenging it is to find a role right now. I don&#x2019;t think she&#x2019;s wrong, which is why I typically advise against looking for software engineering roles&#x2026;</p><p>I don&#x2019;t mean you shouldn&#x2019;t look for a job. The role &#x201C;software engineer&#x201D; is super generic. Out of college, it&#x2019;s tempting to look for a generic role because you may not know what you want to specialize in. There are dozens, if not hundreds, of different areas to specialize in, after all. But if you can find a specialty, your chances of getting hired go up massively.</p><h1 id="specializations-within-software-engineering"><strong>Specializations Within Software Engineering</strong></h1><p>So let&#x2019;s go through a few of them! I pulled together a broad list of specializations that you may come across. Within each of these there may be sub-areas you can specialize in. For example, in DevOps you may specialize in scaling, configuration management, or provisioning. 
</p><h3 id="infrastructure-tooling"><strong>Infrastructure &amp; Tooling</strong></h3><ul><li>DevOps / Platform Engineer &#x2013; Builds CI/CD pipelines, manages infra as code, and improves deployment velocity.</li><li>Build Systems Engineer &#x2013; Designs and maintains systems like Bazel, Buck, or Gradle for efficient builds.</li><li>Developer Productivity / DX Engineer &#x2013; Improves tools and workflows that other developers within your company use.</li><li>Test Automation Engineer &#x2013; Creates and maintains automated testing frameworks for CI and release readiness.</li></ul><h3 id="systems-level-engineering"><strong>Systems-Level Engineering</strong></h3><ul><li>Operating Systems / Kernel Engineer &#x2013; Works on OS internals, drivers, and low-level performance tuning.</li><li>Embedded Systems Engineer &#x2013; Writes software for devices with constrained resources like microcontrollers.</li><li>Networking / Network Automation Engineer &#x2013; Automates infrastructure (e.g., routers/switches) or develops network-heavy systems.</li></ul><h3 id="data"><strong>Data</strong></h3><ul><li>Search Engineer &#x2013; Designs and tunes search relevance, indexing pipelines, and vector search systems.</li><li>Data Engineer &#x2013; Builds pipelines and transforms data for analytics, ML, and operational needs.</li><li>Database Tooling Engineer &#x2013; Builds and maintains custom database infrastructure, schema migration tools, etc.</li></ul><h3 id="software-application-specialties"><strong>Software Application Specialties</strong></h3><ul><li>Backend Engineer &#x2013; Focuses on APIs, services, business logic, and database integration.</li><li>Frontend Engineer &#x2013; Specializes in building responsive and interactive user interfaces.</li><li>Mobile Developer &#x2013; Builds apps for iOS/Android, often with platform-specific constraints.</li><li>Integrations Engineer &#x2013; Connects systems via APIs, webhooks, or custom connectors.</li></ul><h3 
id="advanced-niche-roles"><strong>Advanced / Niche Roles</strong></h3><ul><li>Programming Language Specialist &#x2013; Works on compilers, interpreters, or language-specific tooling.</li><li>Observability Engineer &#x2013; Specializes in metrics, logging, tracing, and incident diagnostics.</li><li>Security Engineer &#x2013; Focuses on application security, auditing, threat modeling, and secure coding practices.</li><li>Performance Engineer &#x2013; Analyzes and improves latency, throughput, and resource usage across systems.</li></ul><h1 id="cross-functional-adjacent-roles-using-software-skills"><strong>Cross-Functional &amp; Adjacent Roles Using Software Skills</strong></h1><p>While these roles won&#x2019;t always involve writing code, they benefit from a familiarity with software engineering. I find that a lot of these are great when paired with non-technical strengths like interpersonal or organizational skills.</p><h3 id="product-marketing-customer-engagement"><strong>Product, Marketing &amp; Customer Engagement</strong></h3><ul><li>Product Manager (Technical) &#x2013; Translates user needs into engineering work, often with a computer science background.</li><li>Developer Advocate / DevRel &#x2013; Bridges engineering and marketing, creating content and engaging developers.</li><li>Technical Writer &#x2013; Writes documentation, tutorials, and internal guides for technical products.</li><li>Sales Engineer / Solutions Architect &#x2013; Supports sales by explaining technical concepts and customizing demos.</li><li>Technical Account Manager &#x2013; Supports large customers with the ongoing use and adoption of a software product. 
</li></ul><h3 id="support-ops-strategy"><strong>Support, Ops &amp; Strategy</strong></h3><ul><li>Support Engineer &#x2013; Provides technical support, triages bugs, and works with engineering on fixes.</li><li>Site Reliability Engineer (SRE) &#x2013; Ensures uptime and reliability of systems, often blending dev + ops.</li><li>Systems Administrator &#x2013; Manages servers, users, backups, and local IT needs&#x2014;especially in hybrid setups.</li><li>IT Automation Engineer &#x2013; Automates enterprise software provisioning, access controls, and device management.</li><li>Data Analyst / BI Engineer &#x2013; Uses SQL, dashboards, and scripts to help teams make data-driven decisions.</li><li>Technical Advisor &#x2013; Helps build analyses for executives to guide strategy.</li></ul><h3 id="special-projects-hybrid-roles"><strong>Special Projects &amp; Hybrid Roles</strong></h3><ul><li>Internal Tools Developer &#x2013; Builds bespoke software for non-engineering teams (e.g., finance, HR).</li><li>ML Ops / AI Infrastructure &#x2013; Supports deployment, monitoring, and tuning of machine learning models.</li><li>Research Software Engineer &#x2013; Bridges academia and industry, supporting research with production-grade code.</li><li>Digital Humanities / Scientific Computing &#x2013; Applies software engineering to the arts or sciences.</li></ul><h2 id="what-next">What Next?</h2><p>Now that you&#x2019;ve looked over the list, it&#x2019;s time to explore! Pick a role or two and start finding out what&apos;s involved in working in them. Many of them you can explore with a computer or the phone you are reading this on. Each niche has Discord and Slack communities that you can join to ask questions. Often they also have meetups where people get together to talk about them. <br><br>Find someone working in one of these fields and get a virtual coffee with them! The best way to find a role is to get involved in any way you can. 
Best of luck finding your niche.</p>]]></content:encoded></item><item><title><![CDATA[Haystack 2025 Takeaways]]></title><description><![CDATA[Haystack 2025 highlighted the future of search with advances in vector search infrastructure, quantization, re-ranking strategies, and user behavior insights. Learn how modern search engines are scaling relevance and performance with cutting-edge techniques.]]></description><link>https://tippybits.com/haystack-2025-takeaways/</link><guid isPermaLink="false">680f97ef7ef3590001cbc085</guid><category><![CDATA[Haystack]]></category><category><![CDATA[Search Relevance]]></category><category><![CDATA[Vector Search]]></category><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Mon, 28 Apr 2025 15:12:45 GMT</pubDate><media:content url="https://tippybits.com/content/images/2025/04/HaystackFeatureImage.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2025/04/HaystackFeatureImage.jpg" alt="Haystack 2025 Takeaways"><p>There were two prevailing themes during Haystack this year: vector search infrastructure and relevance. As an industry, we&#x2019;ve recognized that vector search removes much of the intense work that was previously needed to build search. Synonym graphs and ontologies have given way to fine-tuned embedding models. That is okay; however, because these systems are so new, there is still a lot of engineering required to make vector search stable and relevant.&#xA0;</p><h2 id="vector-search-infrastructure"><strong>Vector Search Infrastructure:</strong></h2><p>Everyone at the conference acknowledged the challenges with deploying vector search to production. Quantization is going to be a necessary technology to adopt in order for us to deploy vector search at scale. We need to be careful, however, because our lack of qualitative metrics will make it easy to deploy vector search and tank relevance. 
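</p><p>To get a feel for why quantization matters, it helps to do the back-of-envelope math on raw vector memory first. The sketch below assumes float32 vectors (4 bytes per dimension); the corpus size and the generic 4x/32x compression factors are illustrative, not Elastic&#x2019;s exact numbers.</p>

```python
# Back-of-envelope memory for raw vs. quantized vectors.
# Assumes float32 (4 bytes per dimension); the 4x / 32x factors below are
# generic scalar / binary compression ratios, not Elastic's exact figures.
def vector_memory_gb(num_vectors: int, dims: int, bytes_per_dim: float) -> float:
    return num_vectors * dims * bytes_per_dim / 1e9

n, d = 10_000_000, 768
full = vector_memory_gb(n, d, 4.0)        # float32 baseline
int8 = vector_memory_gb(n, d, 1.0)        # scalar quantization: 4x smaller
binary = vector_memory_gb(n, d, 1 / 8)    # binary quantization: 32x smaller

print(f"float32: {full:.2f} GB, int8: {int8:.2f} GB, binary: {binary:.2f} GB")
```

<p>Ten million 768-dimension float32 vectors is roughly 30 GB of RAM before any index overhead, which is why the compression profiles below matter so much.</p><p>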
For Elastic, here were the quantization types and their profiles:&#xA0;</p><ul><li><strong>Binary Quantization</strong> - 40x faster and 32x less memory, but at the cost of potentially tanking relevance</li><li><strong>Scalar Quantization (int4/int8) </strong>- 4x less memory with a minimal precision loss</li><li><strong>Product Quantization (PQ)</strong> - up to 64x less memory, but requires much more data engineering. It achieves the least precision loss, however.&#xA0;</li><li><a href="https://www.elastic.co/search-labs/blog/better-binary-quantization-lucene-elasticsearch?ref=tippybits.com"><strong><u>Better Binary Quantization (BBQ)</u></strong></a><strong> </strong>- Binary quantization can have the highest compression (aside from PQ); however, it is typically very lossy. BBQ solves this problem by normalizing around a centroid. Doing this allows it to perform faster than PQ without the infrastructure and engineering requirements. The challenge: while it does have higher recall, Elastic noted its precision is better when re-ranking the original vectors after retrieval.&#xA0;</li></ul><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2025/04/DavidandJoelle.jpeg" class="kg-image" alt="Haystack 2025 Takeaways" loading="lazy" width="2000" height="1125" srcset="https://tippybits.com/content/images/size/w600/2025/04/DavidandJoelle.jpeg 600w, https://tippybits.com/content/images/size/w1000/2025/04/DavidandJoelle.jpeg 1000w, https://tippybits.com/content/images/size/w1600/2025/04/DavidandJoelle.jpeg 1600w, https://tippybits.com/content/images/2025/04/DavidandJoelle.jpeg 2000w" sizes="(min-width: 720px) 720px"><figcaption><a href="https://www.linkedin.com/in/robinsonjoelle/?ref=tippybits.com" rel="noreferrer"><span style="white-space: pre-wrap;">Joelle Robinson</span></a><span style="white-space: pre-wrap;"> and </span><a href="https://www.linkedin.com/in/davidfisher01/?ref=tippybits.com" rel="noreferrer"><span style="white-space: 
pre-wrap;">David Fisher</span></a><span style="white-space: pre-wrap;"> probably discussing the new </span><a href="https://docs.google.com/forms/d/e/1FAIpQLSdQWzlIcdkpRX8kucphHFGi3Qw0580msC0CubPWISSA84Rk_g/viewform?ref=tippybits.com" rel="noreferrer"><span style="white-space: pre-wrap;">mentorship program</span></a></figcaption></figure><h2 id="relevance"><strong>Relevance:&#xA0;</strong></h2><p>There were two primary patterns that were talked about in depth when it comes to relevance: re-ranking and precomputed enrichment.</p><h3 id="re-ranking">Re-ranking</h3><p>While this pattern has been around for a while, it&#x2019;s gaining focus again. With larger and larger vector data stores, we need to rethink search so it can meet our performance goals. This is where re-ranking comes into play. It involves using a much more broadly scoped query to retrieve a list of documents that are &#x201C;relevant enough&#x201D;. The retrieval phase mentioned here is supposed to find all the documents that may be relevant, at the cost of probably including many that are not.&#xA0;</p><p>After retrieval is where we implement the re-ranking step. This step takes the initial document set and adds precision using a more costly ranking algorithm. Some common re-ranking strategies could be the following:&#xA0;</p><ul><li><strong>Exact kNN</strong> - Once you have the first ~1000 or so documents, you can re-create a graph with them and perform exact kNN on them to find a better sort order.</li><li><strong>Cross Encoders </strong>- With cross encoder models you can rescore each result by passing the query and each candidate&#x2019;s text into the same model to produce a more accurate relevancy score. 
These are more expensive to run, which is why we&#x2019;d run them in a re-ranking step.&#xA0;</li><li><strong>Personalization </strong>- Here we could use a two-tower approach to further refine the results to be closer to what the current user expects.</li></ul><h3 id="enrichment">Enrichment:</h3><p>Data enrichment was one of the more bleeding-edge practices beginning to gain traction. Data enrichment involves adding value by either aggregating or enriching documents to make them easier to retrieve or utilize in a generative workflow. Here were a few of the particularly interesting use cases that were seen:</p><ul><li>Generating questions that may be asked to retrieve a document. This can aid retrieval, particularly for very long documents.&#xA0;</li><li>Summary generation for LLMs to utilize</li><li>Classification for better retrieval (e.g., an issue may or may not have a solution)</li></ul><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2025/04/Frameworks.jpeg" class="kg-image" alt="Haystack 2025 Takeaways" loading="lazy" width="2000" height="2099" srcset="https://tippybits.com/content/images/size/w600/2025/04/Frameworks.jpeg 600w, https://tippybits.com/content/images/size/w1000/2025/04/Frameworks.jpeg 1000w, https://tippybits.com/content/images/size/w1600/2025/04/Frameworks.jpeg 1600w, https://tippybits.com/content/images/2025/04/Frameworks.jpeg 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Froh and Sarah&apos;s frameworks side by side as they hot-swapped parts. </span></figcaption></figure><h2 id="bleeding-edge"><strong>Bleeding Edge:</strong></h2><p>There were three things I saw used at the conference that I feel are on the bleeding edge for search. First and most accessible was an ML-based boosting method for weighting hybrid queries. 
In this talk, Daniel Wrigley used features of the query to generate an optimal set of weights for it. It looked at the number of tokens, average token length, etc., to determine whether to bias toward semantic or lexical retrieval. A sample of this workflow can be found in the <a href="https://github.com/o19s/learning-to-hybrid-search-haystack-us-25?ref=tippybits.com"><u>&#x201C;learning-to-hybrid-search&#x201D;</u></a> repo.&#xA0;</p><p>In a talk by Vespa, they introduced a concept called mapped tensor boosting. In this method, they apply boosts to known tensors, allowing for a more explainable search experience. An example of this could be boosting cars that have &#x201C;hatchback&#x201D; as a tensor value. While this boosting method is actually old in the space of the larger search engines like Yahoo and Google, it has yet to gain mainstream popularity.&#xA0;</p><p>Finally, <a href="https://www.ubisearch.dev/?ref=tippybits.com" rel="noreferrer">user behavior insights (UBI)</a> is the last bleeding-edge bit of technology. Again, clickstream logging has been around for a long time; however, it has lacked a standard. With the addition of a standard, clickstream-based workflows can begin to be standardized.</p>]]></content:encoded></item><item><title><![CDATA[The Last Business Review]]></title><description><![CDATA[June 10th edition of the WBR. A running public review of my progress building TippyBits. This week included a new lead, the strategy is starting to work, and a new focus. ]]></description><link>https://tippybits.com/the-wbr-6-10-2024/</link><guid isPermaLink="false">6667009777e91b0001c77c16</guid><category><![CDATA[WBR]]></category><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Mon, 10 Jun 2024 14:41:10 GMT</pubDate><content:encoded><![CDATA[<p>This week there is a lot of excitement! 
I had one lead come from some of my outreach efforts, and I have a good feeling that I am starting to narrow down on a solid target market. This is thanks to a bit of help from <a href="https://jonathanstark.com/?ref=tippybits.com" rel="noreferrer">Jonathan Stark</a>! We had a great talk last week where he helped me work out what it is I am actually providing customers, because infrastructure doesn&apos;t sell. </p><ul><li>Business:<ul><li>I had one lead come from some outreach I did and sent out 10 other highly targeted outreach messages </li><li>On track to monthly goals with 1 external blog published, 1 blog published on my site, and 1 video that is launching today!</li></ul></li><li>Content: <ul><li>Syndicated my vector blog and video on OpenSearch&apos;s blog, which led to a huge uptick in traffic</li><li>Published <a href="https://tippybits.com/what-no-one-told-you-about-vector-search/" rel="noreferrer">&quot;What no one told you about vector search&quot;</a></li><li>Pushed video on <a href="https://youtu.be/q43Lb7DBtKs?si=tL1nSiPPdaK_W9tD&amp;ref=tippybits.com" rel="noreferrer">assembling my compute blade cluster</a> live today! Putting that into today&apos;s WBR as it was a large part of my time last week</li><li><a href="https://opensearch.org/blog/do-you-need-semantic-search/?ref=tippybits.com" rel="noreferrer">Syndicated blog</a> pushed to the OpenSearch website</li></ul></li><li>Social: </li><li>Why am I doing anything other than YouTube? Well, it doesn&apos;t pay just yet. I am down 21% in viewership WoW, but that is still my largest traffic source (671 views). 
Subscriber count is up 15% though (78 to 89 subs).</li><li>Website up to 229 users (+316%), showing that content syndication is the key to maintaining traffic at these early stages</li><li>Twitter saw a noticeable uptick in impressions (4,234 impressions, +70%) because of two stupid tweets: </li></ul><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">Wait, what did I miss?</p>&#x2014; David Tippett (@dtaivpp) <a href="https://twitter.com/dtaivpp/status/1798066671711875273?ref_src=twsrc%5Etfw&amp;ref=tippybits.com">June 4, 2024</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></figure><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">You want job security? Get into search relevance. There&apos;s like 1000 people in the search community and they all fit in one slack channel &#x1F605;</p>&#x2014; David Tippett (@dtaivpp) <a href="https://twitter.com/dtaivpp/status/1799615583585218988?ref_src=twsrc%5Etfw&amp;ref=tippybits.com">June 9, 2024</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></figure><ul><ul><li>LinkedIn has steadily chugged on (impressions +30% WoW, followers +1.5% WoW). It remains my biggest website traffic driver with 56% of my site traffic coming from LinkedIn in the last week</li></ul><li>Strategy: <ul><li>This week I&apos;ve pivoted (surprise) and am doing a lot more outreach on the advice of Jonathan. I&apos;ve avoided outreach until this point because I didn&apos;t have a specific enough target audience. Now that I&apos;ve honed in on that, I made some changes to my site, LinkedIn bio, and Twitter bio to better reflect what I am doing. </li></ul></li></ul>]]></content:encoded></item><item><title><![CDATA[What no one told you about vector search]]></title><description><![CDATA[As you're adopting vector search, missing these few things could make or break your experience. These are the things I've found nearly everyone misses when they are rolling out vector search. ]]></description><link>https://tippybits.com/what-no-one-told-you-about-vector-search/</link><guid isPermaLink="false">664bfbb777e91b0001c77a00</guid><category><![CDATA[OpenSearch]]></category><category><![CDATA[Vector Search]]></category><category><![CDATA[RAG]]></category><category><![CDATA[Data Engineering]]></category><category><![CDATA[Data Pipeline]]></category><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Wed, 05 Jun 2024 18:45:51 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/06/VectorSearchSecrets.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/06/VectorSearchSecrets.jpg" alt="What no one told you about vector search"><p>My background was never search or relevance engineering. For the ~7 years before I started working with OpenSearch, I had been doing DevOps, data engineering, and software engineering. That&apos;s why I had such a hard time when vector search started to become so popular. 
Every time I felt I knew what was happening, my understanding was shaken up. Turns out it&apos;s not just me. I&apos;ve talked with countless engineers recently who have hit some of the same roadblocks that I&apos;ll be mentioning. </p><p>While I can&apos;t share everything I&apos;ve learned in just one blog, let&apos;s go through some of the top issues I&apos;ve seen over and over. These are the ones that would typically stop a company from ever adopting vectors into their search workflow. </p><h2 id="machine-learning-models-have-a-limited-input">Machine learning models have a limited input</h2><p>I was just talking to a company a few weeks back about their search use case. When I asked them if they had tried semantic search, they said they had, and the results were tragic. Shocked to hear this, I asked why they thought it didn&apos;t work. From my perspective, I felt their use case was a perfect fit for vector search. </p><p>After digging in, I found that they were embedding documents that were tens of thousands of words long. This may seem reasonable until we start to look at how vector embeddings work in general and then specifically in OpenSearch. </p><p>When we embed documents, we send them through a machine learning (ML) model. These models take in a fixed number of tokens and will output a representation as a vector of a fixed size. Tokens, in most cases, are roughly words or pieces of words. ColBERTv2.0, for example, takes in <a href="https://huggingface.co/colbert-ir/colbertv2.0/blob/main/tokenizer_config.json?ref=tippybits.com" rel="noreferrer">512 tokens by default</a> and outputs a vector with <a href="https://huggingface.co/colbert-ir/colbertv2.0/blob/c1e84128e85ef755c096a95bdb06b47793b13acf/config.json?ref=tippybits.com#L11" rel="noreferrer">768 floating-point numbers</a>. 
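</p><p>To make that limit concrete, here is a toy sketch of input truncation. The 512-token cap mirrors the ColBERTv2.0 default above, but the whitespace &quot;tokenizer&quot; is a simplification; real models use subword tokenizers, so exact counts will differ.</p>

```python
# Toy illustration of embedding-input truncation.
# MAX_TOKENS mirrors a common 512-token model limit; splitting on
# whitespace is a simplification of real subword tokenizers.
MAX_TOKENS = 512

def truncate_for_model(text: str, max_tokens: int = MAX_TOKENS) -> tuple[str, int]:
    tokens = text.split()                       # naive whitespace "tokenizer"
    dropped = max(0, len(tokens) - max_tokens)  # tokens the model never sees
    return " ".join(tokens[:max_tokens]), dropped

doc = "word " * 1000  # a 1000-token document
kept, dropped = truncate_for_model(doc)
print(f"kept {len(kept.split())} tokens, silently dropped {dropped}")
```

<p>Everything past the limit contributes nothing to the embedding, which is exactly what bit the company above.</p><p>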
</p><figure class="kg-card kg-image-card"><img src="https://tippybits.com/content/images/2024/06/Embedding_process.png" class="kg-image" alt="What no one told you about vector search" loading="lazy" width="1000" height="500" srcset="https://tippybits.com/content/images/size/w600/2024/06/Embedding_process.png 600w, https://tippybits.com/content/images/2024/06/Embedding_process.png 1000w" sizes="(min-width: 720px) 720px"></figure><p>So how does OpenSearch handle text longer than the input maximum for a model? It truncates it. By this, I mean if you ingest a 1000-word document and the model only supports 512 tokens, OpenSearch will simply cut your document down to 512 tokens and discard the rest. This is why the above company had bad relevance when they were embedding their hyper-long documents. Their documents were being cut down to the first 512 tokens and the remainder was discarded.</p><figure class="kg-card kg-image-card"><img src="https://tippybits.com/content/images/2024/06/Embedding_process_chopped.png" class="kg-image" alt="What no one told you about vector search" loading="lazy" width="1000" height="500" srcset="https://tippybits.com/content/images/size/w600/2024/06/Embedding_process_chopped.png 600w, https://tippybits.com/content/images/2024/06/Embedding_process_chopped.png 1000w" sizes="(min-width: 720px) 720px"></figure><p>The right way to handle documents with longer inputs depends on how you plan to query them. No matter what you do, though, I feel cutting documents without alerting the user is bad UX. This is why I&apos;ve laid out the following proposal to change this pattern in OpenSearch. 
Check it out and please comment your thoughts!</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/opensearch-project/ml-commons/issues/2466?ref=tippybits.com"><div class="kg-bookmark-content"><div class="kg-bookmark-title">[FEATURE] Fail documents who&#x2019;s embedding field is larger than the token limit of an embedding model &#xB7; Issue #2466 &#xB7; opensearch-project/ml-commons</div><div class="kg-bookmark-description">Is your feature request related to a problem? Yes! I just had a discussion with @ylwu-amzn where we were discussing how documents are embedded. I (and many others I have talked to) were under the i&#x2026;</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/assets/pinned-octocat-093da3e6fa40.svg" alt="What no one told you about vector search"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">opensearch-project</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/84069614e2a99bf924005daa81ad389ac9189bb144b50bac5bb56f638e1d1f99/opensearch-project/ml-commons/issues/2466" alt="What no one told you about vector search"></div></a></figure><h2 id="handling-longer-documents">Handling longer documents</h2><p>So with a fixed number of input tokens, how can these models embed longer documents? There are a few strategies we will explore here. Which you end up choosing depends on how you plan to access the documents. I think this will make a bit more sense as we start working through each of the scenarios. </p><p>You need chunks! Chunking is when we cut our documents into pieces. While this may seem like a simple task, there are quite a few parameters to tune here. How big are our chunks? Do they overlap? By how much? How do we represent these in OpenSearch? </p><p>Let&apos;s start with your goals for these documents. What do you plan on doing with them? 
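</p><p>As a reference point, a minimal sliding-window chunker shows the two main knobs, chunk size and overlap. The word-based splitting and parameter values here are illustrative; real chunkers (including OpenSearch&apos;s ingest processor) work on model tokens and often respect sentence boundaries.</p>

```python
# Minimal sliding-window chunker illustrating the size/overlap knobs.
# Word-based splitting is a stand-in for real token-aware chunkers.
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]

chunks = chunk_text("token " * 1000, chunk_size=512, overlap=64)
print(len(chunks))  # a 1000-word document yields 3 overlapping chunks
```

<p>The overlap keeps sentences that straddle a boundary from losing their context in both chunks.</p><p>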
For example, the company we mentioned above helped their users find books matching their search. Others may be looking to return just the most relevant chunk of a document for use in a retrieval-augmented generation (RAG) pipeline.</p><h2 id="strategies-for-vector-ingestion">Strategies for vector ingestion</h2><p>When your goal is to retrieve a <strong>whole document</strong>, I&apos;d recommend using OpenSearch&apos;s <a href="https://opensearch.org/docs/latest/search-plugins/text-chunking/?ref=tippybits.com" rel="noreferrer">ingestion pipeline for chunking documents</a>. This method will chunk the target field from the document into multiple vector embeddings. These will be stored in a nested field. This method comes with a few distinct advantages. With the score mode set to &quot;max&quot;, we can find the document that has the highest matching passage. We can also use &quot;avg&quot; to see which document has the most relevant chunks related to the query. </p><pre><code>GET index_name/_search
{
  &quot;query&quot;: {
    &quot;nested&quot;: {
      &quot;score_mode&quot;: &quot;avg&quot;,
      &quot;path&quot;: &quot;passage_chunk_embedding&quot;,
      &quot;query&quot;: {
        &quot;neural&quot;: {
          &quot;passage_chunk_embedding.knn&quot;: {
            &quot;query_text&quot;: &quot;document&quot;,
            &quot;model_id&quot;: &quot;-tHZeI4BdQKclr136Wl7&quot;
          }
        }
      }
    }
  }
}</code></pre><figure class="kg-card kg-image-card"><img src="https://tippybits.com/content/images/2024/06/Chunk_and_embed.png" class="kg-image" alt="What no one told you about vector search" loading="lazy" width="1275" height="499" srcset="https://tippybits.com/content/images/size/w600/2024/06/Chunk_and_embed.png 600w, https://tippybits.com/content/images/size/w1000/2024/06/Chunk_and_embed.png 1000w, https://tippybits.com/content/images/2024/06/Chunk_and_embed.png 1275w" sizes="(min-width: 720px) 720px"></figure><p>Again, this is the strategy I would recommend for companies whose goal is to return the whole document. It&apos;s going to dramatically simplify your ingestion setup and your relevancy scoring. For people interested in returning the <strong>individual parts</strong> of a document for a RAG-type use case, I recommend using an external chunker such as <a href="https://docs.haystack.deepset.ai/docs/documentsplitter?ref=tippybits.com#overview" rel="noreferrer">Haystack&apos;s document splitter</a> or <a href="https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/?ref=tippybits.com" rel="noreferrer">LangChain&apos;s text splitter</a>. These will allow you to split the documents into individual chunks and store them as separate documents. 
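</p><p>For the chunk-per-document approach, the shape of the data might look like the sketch below. The index and field names are hypothetical; the point is that each chunk becomes its own document with a pointer back to its parent.</p>

```python
# Sketch: store each chunk as its own document, keeping a pointer back to
# the parent document. Index and field names here are hypothetical.
def chunks_to_docs(parent_id: str, chunks: list[str]) -> list[dict]:
    return [
        {
            "_index": "rag-chunks",   # hypothetical index name
            "parent_id": parent_id,   # lets you trace a hit back to its source
            "chunk_no": i,            # position of the chunk in the original
            "text": chunk,            # an ingest pipeline could embed this field
        }
        for i, chunk in enumerate(chunks)
    ]

docs = chunks_to_docs("book-42", ["first passage...", "second passage..."])
print(docs[1]["chunk_no"])  # 1
```

<p>Documents shaped like this could then be sent through a bulk ingestion request, with an embedding step applied to the text field along the way.</p><p>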
This is because the goal of RAG is to find the most relevant chunk to provide context for LLM generation.</p><figure class="kg-card kg-image-card"><img src="https://tippybits.com/content/images/2024/06/Chunked_RAG.png" class="kg-image" alt="What no one told you about vector search" loading="lazy" width="1275" height="499" srcset="https://tippybits.com/content/images/size/w600/2024/06/Chunked_RAG.png 600w, https://tippybits.com/content/images/size/w1000/2024/06/Chunked_RAG.png 1000w, https://tippybits.com/content/images/2024/06/Chunked_RAG.png 1275w" sizes="(min-width: 720px) 720px"></figure><h2 id="the-final-bosscapacity-planning">The final boss - capacity planning</h2><p>Once you have your strategy for ingesting vectors planned, you need to start capacity planning. Vectors operate a bit differently in OpenSearch than typical documents. First off, generating and ingesting embeddings takes significantly longer than ingesting text. It&apos;s important to test and see how long this process takes for your documents. </p><p>Your nodes also need to be right-sized for vector search. Assuming you are using HNSW, which I feel most people will be, all of the vectors are going to be stored in memory. Below is the rough calculation for memory use with HNSW. This is with the out-of-the-box configuration.</p><pre><code>1.1 * (2 * &lt;dimensions&gt; + 128) * &lt;num_vectors&gt; ~= Bytes

# So for 1,000,000 vectors with 512 dimensions:
1.1 * (2 * 512 + 128) * 1,000,000 ~= 1,267,200,000 Bytes or ~1.27GB</code></pre><p>I&apos;d recommend using memory-optimized instances for vector search. In AWS these are the &quot;R&quot; instances. One of the interesting things about vectors in OpenSearch is that they are loaded into memory space outside of the JVM. So if 50% of your memory is dedicated to the JVM, as typically recommended, then you will have the remaining 50% for the OS and vector storage. </p><h2 id="is-that-it">Is that it?</h2><p>Not even close! There is so much to know when it comes to vector search. The things I&apos;ve covered here are just enough to help you get a solid start. There is still model fine-tuning, relevance engineering, hybrid search, and the list goes on. For a deeper dive on vectors in OpenSearch check out <a href="https://aws.amazon.com/blogs/big-data/amazon-opensearch-services-vector-database-capabilities-explained/?ref=tippybits.com" rel="noreferrer">this fantastic blog</a> from AWS.</p><p>If your company would like to evaluate vector search I am here to help! While working at AWS I helped dozens of customers get started with vector search, and I&apos;d love to help you too! <a href="https://tippybits.com/services" rel="noreferrer">Schedule some time with me for your free consultation!</a></p>]]></content:encoded></item><item><title><![CDATA[Should you be using semantic search?]]></title><description><![CDATA[With all the hype around semantic search you might be asking, should you be using it too? We'll walk through what semantic search is and how you can use it to augment your search. 
]]></description><link>https://tippybits.com/should-you-be-doing-vector-search/</link><guid isPermaLink="false">66393f27301d9c000156ce82</guid><category><![CDATA[OpenSearch]]></category><category><![CDATA[Semantic Search]]></category><category><![CDATA[Vector Search]]></category><category><![CDATA[DistilBERT]]></category><category><![CDATA[How To]]></category><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Wed, 15 May 2024 14:30:16 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/05/Ferarri-Car.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/05/Ferarri-Car.jpg" alt="Should you be using semantic search?"><p>I&#x2019;ve been getting this question a lot recently. It&#x2019;s a fair question, especially as people have been using the same tools for search for a long time. I&#x2019;ll walk you through some of the challenges with search as it currently exists. Then I&#x2019;ll show why and how to use semantic search in OpenSearch. Finally, we will cover some of the potential challenges with using vector search. While it has quite a few upsides, they come at a cost.</p><h2 id="opensearchs-default-search">OpenSearch&apos;s default search </h2><p>OpenSearch by default is a lexical document store. This means it takes keywords from a query and tries to find them in the documents from your index. It can enrich documents using different analyzers like the <a href="https://opensearch.org/docs/latest/analyzers/?ref=tippybits.com#built-in-analyzers" rel="noreferrer">English language analyzer</a>. This one does stemming, where it reduces <code>run</code> and <code>running</code> to the same word since they share the same root.</p><p>As OpenSearch finds documents with words matching your query, it begins to calculate a relevance score. The score is based on how many times a word appears in a given document and how many other documents contain that word. 
This helps to filter out common words like <code>the</code> or <code>and</code> which exist in many documents. </p><p>These techniques have been the foundation of the search industry for years now. They do poorly when we have words that may be semantically similar like <code>sprint</code> and <code>run</code>. We may know intuitively that these words are similar. Unless we create some fairly complicated synonym rules, documents with <code>run</code> won&#x2019;t be returned if we have <code>sprint</code> in our query.</p><p>Even with synonym rules lexical search engines still struggle with context. Say we have a document with the phrase <code>racecar driver</code> and another with <code>golf driver</code>. If we search for <code>supercar driver</code> both of those documents will be returned because the term <code>driver</code> shows up in both. </p><h2 id="with-semantic-search">With semantic search</h2><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/05/Semantic-to-Embedding.png" class="kg-image" alt="Should you be using semantic search?" loading="lazy" width="367" height="232"><figcaption><span style="white-space: pre-wrap;">How embedding models are related to vector search.</span></figcaption></figure><p>This is where semantic search comes into play. With machine learning (ML) models, we can create a representation of what a document means. The representation is encoded as a vector (an array of floating point numbers). Vectors are the way to represent and compare the outputs of these ML models. As shown above, many different types of models may output vectors that can be searched and compared to each other.  </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/05/Text-To-Embedding.png" class="kg-image" alt="Should you be using semantic search?" 
loading="lazy" width="397" height="222"><figcaption><span style="white-space: pre-wrap;">A depiction of what happens to text as it&apos;s embedded. </span></figcaption></figure><p>Using one of these semantic models, we encode our documents as a vector. Above I have shown an example of what the output might look like. Even though the inputs are different lengths, they are all output as a consistently sized array of floating point numbers between 0 and 1. Since there are 3 floating point numbers per array, this is a 3-dimensional vector. When we run this we will use <a href="https://huggingface.co/sentence-transformers/msmarco-distilbert-base-tas-b?ref=tippybits.com" rel="noreferrer">msmarco-distilbert</a>, which outputs vectors that have 768 dimensions. </p><p>Next, I&apos;ll plot these vectors so we can see what they might look like. OpenSearch uses some fancy linear algebra to compare vectors. This happens with functions like cosine similarity or dot product. I don&apos;t think there is any way to represent what&apos;s happening in a nice visual way so we will simplify it with the below plot.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/05/EmbeddingSpace-3.png" class="kg-image" alt="Should you be using semantic search?" loading="lazy" width="412" height="379"><figcaption><span style="white-space: pre-wrap;">A representation of what vectors look like plotted on a graph.</span></figcaption></figure><p>I&apos;ve also used the words next to the points to show the original documents these represent. As we can see, words that are similar share dimensions. For example, it seems things related to golf share the 0.3 dimension on the X axis, whereas things related to driving are sharing a 0.5 dimension on the Y axis. </p><p>Now if we were to execute our search for <code>supercar driver</code>, we would need to run the search phrase through our embedding model. This may look like <code>[0.47, 0.5, 0.01]</code>. 
This embedding will then be compared with the other vectors to find the ones that look most like it. Now we can see that our document <code>racecar driver</code> is very similar to <code>supercar driver</code>. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/05/Vector-Search-Example.png" class="kg-image" alt="Should you be using semantic search?" loading="lazy" width="412" height="379"><figcaption><span style="white-space: pre-wrap;">A graph showing the query for supercar driver plotted relative to documents in our index. </span></figcaption></figure><p>So semantic search is used when your users might not know the exact words they need to use to find the documents they want. Question-answering systems will likely be the type of systems where adding semantic search will improve the quality of result matches. These are the types of systems where people are asking questions and may not use the exact words in the documents. </p><h3 id="creating-a-semantic-search-experience-in-opensearch">Creating a semantic search experience in OpenSearch</h3><p>The best way to get started is to use the <a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/ml-amazon-connector.html?ref=tippybits.com">OpenSearch connectors</a>. These will allow you to build an ingestion pipeline that will automatically create embeddings for data as it&#x2019;s sent into your cluster. These examples should be run from the OpenSearch devtools. </p><p>First, we will set up the cluster to support ML workloads and allow access to connectors. Below are the settings that we need to enable to run models in OpenSearch. </p><pre><code>PUT _cluster/settings
{
  &quot;persistent&quot;: {
    &quot;plugins&quot;: {
      &quot;ml_commons&quot;: {
        &quot;allow_registering_model_via_url&quot;: &quot;true&quot;,
        &quot;only_run_on_ml_node&quot;: &quot;false&quot;,
        &quot;model_access_control_enabled&quot;: &quot;true&quot;,
        &quot;native_memory_threshold&quot;: &quot;99&quot;
      }
    }
  }
}</code></pre><p>Next, we create a model group for access control. Save the model group ID that is returned for later.</p><pre><code>POST /_plugins/_ml/model_groups/_register
{
    &quot;name&quot;: &quot;Model_Group&quot;,
    &quot;description&quot;: &quot;Public ML Model Group&quot;,
    &quot;access_mode&quot;: &quot;public&quot;
}
# MODEL_GROUP_ID:
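#
# If you lose this ID you can find it again with the model group
# search API, e.g.:
# POST /_plugins/_ml/model_groups/_search
# { &quot;query&quot;: { &quot;match_all&quot;: {} } }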
</code></pre><p>Here is a sample using msmarco-distilbert as a local embedding model. This is going to be one of the most straightforward ways for us to test this. In the request below, make sure to paste the model group ID we saved above. Now we are going to register the model into our cluster. This downloads the model for use locally. Remember to save the task ID generated here for the next steps. </p><pre><code>POST /_plugins/_ml/models/_register
{
    &quot;name&quot;: &quot;huggingface/sentence-transformers/msmarco-distilbert-base-tas-b&quot;,
    &quot;version&quot;: &quot;1.0.2&quot;,
    &quot;model_group_id&quot;: &quot;&lt;MODEL_GROUP_ID&gt;&quot;,
    &quot;model_format&quot;: &quot;TORCH_SCRIPT&quot;
}
# TASK_ID:
</code></pre><p>Now we need to wait for the model to be pulled down and deployed. To check the status we can call the following:</p><pre><code>GET /_plugins/_ml/tasks/&lt;TASK_ID&gt;
# MODEL_ID:
</code></pre><p>When the state changes to <code>completed</code>, the operation is done and we can grab the model ID and deploy the model in our cluster.</p><pre><code>POST /_plugins/_ml/models/&lt;MODEL_ID&gt;/_deploy
# TASK_ID:
</code></pre><p>This will result in a new task ID that we will use to check if the model gets deployed. Once this task shows as <code>completed</code>, we will know the model is ready for use.</p><pre><code>GET /_plugins/_ml/tasks/&lt;TASK_ID&gt;
</code></pre><p>We can test the model by trying to embed some text:</p><pre><code>### Testing the Embedding Model
POST /_plugins/_ml/_predict/text_embedding/&lt;MODEL_ID&gt;
{
  &quot;text_docs&quot;: [&quot;This should get embedded.&quot;],
  &quot;return_number&quot;: true,
  &quot;target_response&quot;: [&quot;sentence_embedding&quot;]
}</code></pre><p>If that returns a large array of floating point numbers, we are in a good state! Now we can use the model in our ingestion and search pipeline. Below, <code>embedding-ingest-pipeline</code> is just an arbitrary name for the pipeline. The important part is the <code>field_map</code> section. The ingestion pipeline will look for the key field, in this case <code>content</code>, send its value to our ML model to get embedded, and store the result in the <code>content_embedding</code> field. The key needs to match the name of the field you wish to embed in your source data, but the value field name is arbitrary.</p><pre><code>PUT _ingest/pipeline/embedding-ingest-pipeline
{
  &quot;description&quot;: &quot;Neural Search Pipeline&quot;,
  &quot;processors&quot; : [
    {
      &quot;text_embedding&quot;: {
        &quot;model_id&quot;: &quot;&lt;MODEL_ID&gt;&quot;,
        &quot;field_map&quot;: {
          &quot;content&quot;: &quot;content_embedding&quot;
        }
      }
    }
  ]
}</code></pre><p>Now we create a hybrid search processor. This will allow us to run traditional keyword search alongside vector search and combine the result sets at the end. Here <code>hybrid-search-pipeline</code> is the name of our search pipeline. It can be whatever you would like so long as you use the same name when connecting it with your index.</p><pre><code>## Put the search pipeline in place
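## Note: weights are applied to the sub-queries of the hybrid query in
## order, so 0.3 weights the keyword match and 0.7 weights the neural
## query. Treat these values as a starting point and tune them.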
PUT _search/pipeline/hybrid-search-pipeline
{
  &quot;phase_results_processors&quot;: [
    {
      &quot;normalization-processor&quot;: {
        &quot;normalization&quot;: {
          &quot;technique&quot;: &quot;min_max&quot;
        },
        &quot;combination&quot;: {
          &quot;technique&quot;: &quot;arithmetic_mean&quot;,
          &quot;parameters&quot;: {
            &quot;weights&quot;: [
              0.3,
              0.7
            ]
          }
        }
      }
    }
  ]
}</code></pre><p>Now we can finally create an index mapping for our documents! I say index mapping because this simply specifies how certain fields should be mapped as they are pushed into our OpenSearch index. Again, <code>documents</code> is an arbitrary index name picked for this demo.</p><pre><code>PUT /documents
{
    &quot;settings&quot;: {
        &quot;index.knn&quot;: true,
        &quot;default_pipeline&quot;: &quot;embedding-ingest-pipeline&quot;,
        &quot;index.search.default_pipeline&quot;: &quot;hybrid-search-pipeline&quot;
    },
    &quot;mappings&quot;: {
        &quot;properties&quot;: {
            &quot;content_embedding&quot;: {
                &quot;type&quot;: &quot;knn_vector&quot;,
                &quot;dimension&quot;: 768,
                &quot;method&quot;: {
                    &quot;name&quot;: &quot;hnsw&quot;,
                    &quot;space_type&quot;: &quot;innerproduct&quot;,
                    &quot;engine&quot;: &quot;nmslib&quot;
                }
            },
            &quot;content&quot;: {
                &quot;type&quot;: &quot;text&quot;
            }
        }
    }
}
</code></pre><p>Finally, you should be able to insert documents into your index! I would recommend using the <code>_bulk</code> endpoint. It is the most straightforward way to send documents into an index in OpenSearch. Below is an example of what a bulk request looks like. The bulk endpoint expects newline-delimited JSON. The first line is the action you wish to perform, and the second line is the document for that action to apply to.</p><pre><code>POST /documents/_bulk
{ &quot;index&quot;: {&quot;_id&quot;: &quot;1234&quot; } }
{ &quot;content&quot;: &quot;There once was a racecar driver that was super fast&quot;}
{ &quot;index&quot;: {&quot;_id&quot;: &quot;1235&quot; } }
{ &quot;content&quot;: &quot;The golf driver used by tiger woods is the TaylorMade Qi10 LS prototype&quot;}
{ &quot;index&quot;: {&quot;_id&quot;: &quot;1236&quot; } }
{ &quot;content&quot;: &quot;Some may say that supercar drivers dont really mind risk&quot;}</code></pre><p>Once your data is in we can finally pull it all together with a search.</p><pre><code>GET /documents/_search
{
  &quot;_source&quot;: {
    &quot;excludes&quot;: [
      &quot;content_embedding&quot;
    ]
  },
  &quot;query&quot;: {
    &quot;hybrid&quot;: {
      &quot;queries&quot;: [
        {
          &quot;match&quot;: {
            &quot;content&quot;: {
              &quot;query&quot;: &quot;sports automobile&quot;
            }
          }
        },
        {
          &quot;neural&quot;: {
            &quot;content_embedding&quot;: {
              &quot;query_text&quot;: &quot;sports automobile&quot;,
              &quot;model_id&quot;: &quot;&lt;MODEL_ID&gt;&quot;,
              &quot;k&quot;: 5
            }
          }
        }
      ]
    }
  }
}
</code></pre><p>Finally, here are the results! None of the documents contained the word <code>sports</code> or <code>automobile</code>, yet our search was able to determine that the first two documents were more related than the last one. In a production system, we would want to filter out results that score below a certain threshold. </p><pre><code>  &quot;hits&quot;: [
    {
      &quot;_index&quot;: &quot;documents&quot;,
      &quot;_id&quot;: &quot;1234&quot;,
      &quot;_score&quot;: 0.7,
      &quot;_source&quot;: {
        &quot;content&quot;: &quot;There once was a racecar driver that was super fast&quot;
      }
    },
    {
      &quot;_index&quot;: &quot;documents&quot;,
      &quot;_id&quot;: &quot;1236&quot;,
      &quot;_score&quot;: 0.17148913,
      &quot;_source&quot;: {
        &quot;content&quot;: &quot;Some may say that supercar drivers dont really mind risk&quot;
      }
    },
    {
      &quot;_index&quot;: &quot;documents&quot;,
      &quot;_id&quot;: &quot;1235&quot;,
      &quot;_score&quot;: 0.00070000003,
      &quot;_source&quot;: {
        &quot;content&quot;: &quot;The golf driver used by tiger woods is the TaylorMade Qi10 LS prototype&quot;
      }
    }
]</code></pre><h3 id="challenges-with-semantic-search">Challenges with semantic search</h3><p>Even though in this example we were able to see a huge benefit from semantic search, not every use case will see the same results. The first challenge is when your business uses different terms and language than was represented in the training data. For example, in the world of search relevance, there are acronyms like BM25, TF-IDF, and NDCG@10. These are industry-specific and many language models won&apos;t know how to embed them. This could result in random documents being returned. One strategy is using a semantic model that has been fine-tuned on your data. </p><p>Another challenge with semantic search is that it is more resource-intensive than lexical search. For example, HNSW is very memory-intensive as all the vectors are stored in memory (check this <a href="https://opensearch.org/docs/latest/search-plugins/knn/knn-index/?ref=tippybits.com#memory-estimation">memory sizing documentation</a>). Additionally, ingestion takes longer as documents need to be run through the embedding models. Even with all this, many people will see huge benefits from adding vector search to their stack. </p><p>If you&apos;d like help evaluating semantic search, I am here to help! Check out my <a href="https://tippybits.com/services" rel="noreferrer">services</a> page and schedule some time with me. I&apos;d love to talk about what you are building and if vector search is a good fit for you!</p>]]></content:encoded></item><item><title><![CDATA[Creating a Ghost site exposed with Cloudflare]]></title><description><![CDATA[Creating a self-hosted website doesn't have to be hard. We'll walk through how to deploy a GhostCMS site using Cloudflared as a reverse proxy with Docker Compose. 
]]></description><link>https://tippybits.com/deploying-ghost-with-cloudflare/</link><guid isPermaLink="false">661ed1f717d20d00018eae1c</guid><category><![CDATA[GhostCMS]]></category><category><![CDATA[Cloudflare]]></category><category><![CDATA[Docker Compose]]></category><category><![CDATA[Self Hosted]]></category><category><![CDATA[Custom Website]]></category><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Fri, 03 May 2024 14:55:44 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/05/Ghost-and-Cloudflare-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/05/Ghost-and-Cloudflare-1.png" alt="Creating a Ghost site exposed with Cloudflare"><p>I&apos;ve used quite a few different methods for publishing blogs over the years. I helped build the <a href="https://opensearch.org/?ref=tippybits.com" rel="noreferrer">OpenSearch</a> blog on Jekyll, I&apos;ve published using a content platform like <a href="https://blog.tippybits.com/?ref=tippybits.com" rel="noreferrer">Medium</a>, and I&apos;ve written pages using C# and HTML. So far <a href="https://ghost.org/?ref=tippybits.com" rel="noreferrer">Ghost</a> has been my favorite. It&apos;s allowed me to move away from the tedious work of building pipelines for rendering my static pages and focus on writing. </p><p>Ghost is a content management system (CMS). That means I can add all sorts of dynamic elements such as comments, logins, etc. all from within Ghost. Not to mention they support integrations with payment providers allowing me to monetize some or all of my content. </p><p>Another thing that had kept me from hosting my own site for so long was not wanting to mess around with SSL certificates or opening up ports on my firewall. This is where <a href="https://github.com/cloudflare/cloudflared?ref=tippybits.com" rel="noreferrer">Cloudflared</a> comes in. 
It&apos;s let me move the tricky work of DNS, SSL, and firewalling onto a platform built by the experts in these spaces.</p><h2 id="before-we-get-started">Before we get started </h2><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/05/GhostSite.drawio.png" class="kg-image" alt="Creating a Ghost site exposed with Cloudflare" loading="lazy" width="831" height="551" srcset="https://tippybits.com/content/images/size/w600/2024/05/GhostSite.drawio.png 600w, https://tippybits.com/content/images/2024/05/GhostSite.drawio.png 831w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Sample architecture diagram for a Ghost site on an internal network</span></figcaption></figure><p>Above is a sample diagram of how I would deploy this in an internal network. Typically you will want to host externally accessible sites in a part of a network called a DMZ. In my example the DMZ is a VLAN separated from the rest of the network by a firewall that restricts it from accessing the other VLANs. Since Cloudflared is a reverse proxy, we don&apos;t need to open ports on our network&apos;s firewall in order for outside users to get access to our site. </p><p><strong>Server:  </strong>You will need a server somewhere with internet access. I&apos;ve seen people use a <a href="https://www.raspberrypi.com/?ref=tippybits.com" rel="noreferrer">Raspberry Pi</a>, <a href="https://aws.amazon.com/pm/ec2/?ref=tippybits.com" rel="noreferrer">EC2 instance</a>, or an old laptop as a server for their site. You will want to set this server up with an operating system built specifically for the task. I&apos;d recommend something like the most recent <a href="https://ubuntu.com/download/server?ref=tippybits.com" rel="noreferrer">LTS version of Ubuntu Server</a> if you are newer to Linux. This is the one that I will demonstrate today.  
</p><p>Once you have your server set up with Ubuntu, you will want to install <a href="https://docs.docker.com/engine/install/ubuntu/?ref=tippybits.com" rel="noreferrer">Docker Engine</a>. Docker Engine does two things for us. First, it allows us to run containers, which are a convenient way for us to run applications. Additionally, it comes with <code>docker compose</code>, which is a utility that allows us to run several containers at once in an isolated network on our server. More on this later though.  </p><p><strong>Domain Name: </strong>This is how your users will access your site. Since we are using Cloudflare for our reverse proxy I&apos;d recommend purchasing your domain through them. It makes setting up your site much more straightforward. I won&apos;t be covering this in depth here as there are <a href="https://www.cloudflare.com/products/registrar/?ref=tippybits.com" rel="noreferrer">plenty of guides</a> on how to do this. </p><h2 id="configuring-your-site">Configuring your site</h2><p>To start, I&apos;ve created a sample repo (linked below) that can help get you going quickly with deploying your website. I&apos;d recommend forking it to your own GitHub profile so you can modify it to fit your needs. </p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/dtaivpp/ghost-starter?ref=tippybits.com"><div class="kg-bookmark-content"><div class="kg-bookmark-title">GitHub - dtaivpp/ghost-starter: Code for deploying a Ghost site with Cloudflared</div><div class="kg-bookmark-description">Code for deploying a Ghost site with Cloudflared. 
Contribute to dtaivpp/ghost-starter development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/assets/pinned-octocat-093da3e6fa40.svg" alt="Creating a Ghost site exposed with Cloudflare"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">dtaivpp</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/b25f96065c4bb586675b0de947de8deb6e29db36d31f0f63ec2f571025c3583e/dtaivpp/ghost-starter" alt="Creating a Ghost site exposed with Cloudflare"></div></a></figure><p>Once you have forked it and cloned it down to your server you will need to make the bootstrap file executable. This allows us to run the script. </p><pre><code class="language-bash">cd ghost-starter
sudo chmod +x boostrap.sh
./boostrap.sh</code></pre><p>After running this script you will notice a few files have been created:</p><figure class="kg-card kg-image-card"><img src="https://tippybits.com/content/images/2024/05/TreeFiles.png" class="kg-image" alt="Creating a Ghost site exposed with Cloudflare" loading="lazy" width="857" height="446" srcset="https://tippybits.com/content/images/size/w600/2024/05/TreeFiles.png 600w, https://tippybits.com/content/images/2024/05/TreeFiles.png 857w" sizes="(min-width: 720px) 720px"></figure><p>Inside the MySQL password and root password files you will find auto-generated passwords. You do not need to use these passwords; however, they tend to be harder to crack than manually created passwords.</p><p>Next, we will want to look at the .env file. It contains several configuration parameters we will want to set. The first will be the URL for your domain. This will affect where relative links go. When testing you will likely set this to something like <code>http://localhost:2368</code> and for production you will want to set this to your domain name, like <code>https://tippybits.com</code>. </p><p>For our MySQL user, this could be whatever you want to name your database user. Since I am just using this database for Ghost I will go with <code>ghostdb</code> here. Then the last setting I will talk about in this section is the <code>DEPLOYMENT</code> flag. This is how we will switch from a development Ghost instance to a production one. </p><pre><code># Change the domain to your website or localhost if testing.
URL=&lt;https://yourdomain.com&gt;

# Could be any user you&apos;d like
MYSQL_USER=ghostdb

# SMTP - Mail from is an address you have permission to mail from. Below is just a demo. 
# SMTP User - Username provided by your SMTP service
MAIL_FROM=noreply@&lt;mail.yourdomain.com&gt;
SMTP_USER=
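# Note: the SMTP password does not go in this file - it belongs in the
# secrets/SMTP_PASS.txt file created earlier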

# Set between production and development
DEPLOYMENT=development</code></pre><p><strong>Mailserver: </strong>This step is optional, but if you&apos;d like users to be able to log in or comment, you will need to configure the SMTP settings. You can use Mailgun for this. I am not going to go through the whole process of how to set up Mailgun with your domain as they have guides that help with this when you create your account. What I will tell you is how to find your SMTP credentials once you&apos;ve set up your domain to send and receive mail from Mailgun. </p><p>Go into your account and go to &quot;Sending&quot; on the left bar. Then select &quot;Domains&quot; and pick your domain from the list. Then you can go to the left bar again and select &quot;Domain Settings&quot; and you will see the below window. From here we will select SMTP and grab those settings and input the username into the <code>.env</code> file in the <code>SMTP_USER</code> field. This will probably be something like <code>postmaster@&lt;yourdomain.com&gt;</code>. The password will go into the <code>/secrets/SMTP_PASS.txt</code> file we created earlier. </p><figure class="kg-card kg-image-card"><img src="https://tippybits.com/content/images/2024/05/MailGunSend.png" class="kg-image" alt="Creating a Ghost site exposed with Cloudflare" loading="lazy" width="1207" height="448" srcset="https://tippybits.com/content/images/size/w600/2024/05/MailGunSend.png 600w, https://tippybits.com/content/images/size/w1000/2024/05/MailGunSend.png 1000w, https://tippybits.com/content/images/2024/05/MailGunSend.png 1207w" sizes="(min-width: 720px) 720px"></figure><p><strong>Cloudflared: </strong>I debated if I should put this in the configuration section because it&apos;s bound to cause some confusion. Let me be clear here though: <strong>do not store your cloudflared token</strong> in your development environment. Only put the token on your production server. Otherwise your local development environment will become your production. 
</p><p>With that out of the way you will need to go to <a href="https://one.dash.cloudflare.com/?ref=tippybits.com">https://one.dash.cloudflare.com/</a> to log in to your Cloudflare Access&#x2122; dashboard. Once you&apos;ve logged in you will go to the left bar and select &quot;Networks&quot; -&gt; &quot;Tunnels&quot;. Then we will select &quot;Create a tunnel&quot; and &quot;Cloudflared&quot;. Then you will name your tunnel. I kept it simple here and used <code>sitename-ghostcms</code>. Next you will get a screen where you can copy your tunnel deployment command. I selected the docker option and copied it. <strong>Do not run this command. </strong>We are going to paste it onto a notepad or something and pull the token out. </p><pre><code>docker run cloudflare/cloudflared:latest tunnel --no-autoupdate run --token IjoiMTwtafwefawefweflktaelgrZTQ2NmJkYjA2YWY2YmE0OGIiLCJ0IjoiOTViNGU1OGQtNWQ1ZS00YmQ0LTg5ZGItYTdmYWI5NDQwfgw34tk1ERXdZamRsT0RrdFpEZzRNQzAwT1RjMkxUaGlOVaw4ffaw43amd5WW1R</code></pre><p>The above is what it will look like (no, that is not a real token but you can try if you want). We are going to copy everything after <code>--token</code> and paste it into our production server&apos;s <code>/secrets/CLOUDFLARE_TOKEN.txt</code> file. This is the magic that will connect our Cloudflared container to our account. Finally, we will connect the tunnel to our domain. Below are the settings you will want to use, with your own domain name of course. </p><p>This configuration step takes all traffic going to your domain, such as tippybits.com, and routes it through a tunnel managed by Cloudflare that ends up on your server via your cloudflared container. This container then routes the request to <code>http://ghost:2368</code>, which is the local hostname for your Ghost container. It took me forever to understand this. If you use this setup with your domain you should be good to go. 
</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/05/CloudflareDashboard.png" class="kg-image" alt="Creating a Ghost site exposed with Cloudflare" loading="lazy" width="1149" height="817" srcset="https://tippybits.com/content/images/size/w600/2024/05/CloudflareDashboard.png 600w, https://tippybits.com/content/images/size/w1000/2024/05/CloudflareDashboard.png 1000w, https://tippybits.com/content/images/2024/05/CloudflareDashboard.png 1149w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Setting up Cloudflared to access your local docker container</span></figcaption></figure><h2 id="deploying">Deploying </h2><p>Phew. I feel like 99% of the work here is configuring your environment &#x1F62E;&#x200D;&#x1F4A8;. Now we can do the fun part. LET&apos;S DEPLOY! To run your website execute the following command: </p><pre><code class="language-bash">docker compose up -d</code></pre><p>This will download and start up all of the containers with the configuration we have provided. There are a few things I want to point you to while this is starting up. We have not touched the <code>docker-compose.yml</code> file until this point but it&apos;s worth looking at now. </p><p>Ghost does not seem to support accepting configuration via both config files and environment variables. I elected to add the config via environment variables. I couldn&apos;t seem to find anyone who had done the same. Any configuration option you find in the Ghost documentation can be represented with an environment variable flag. Here&apos;s how:</p><pre><code class="language-json">mail: {
  from: some@email.com
  transport: SMTP
  options: {
    service: Mailgun
    host: smtp.mailgun.org
  }
}  </code></pre><p>Above is a sample config. We can turn it into environment variable config by joining each level of nesting with a double underscore. This allows us to set any option that you see in the documentation. In our Docker Compose <code>environment</code> config, the above looks like this:</p><pre><code class="language-yml">services:
  ghost:
    image: ghost:latest
    restart: always
    ports:
      - &quot;2368:2368&quot;
    environment:
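      # Each double underscore maps to one level of nesting in the config
      # above, e.g. mail__options__service sets mail.options.service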
      mail__from: some@email.com
      mail__transport: SMTP
      mail__options__service: Mailgun
      mail__options__host: smtp.mailgun.org</code></pre><p>Hope this helps someone who was as lost as I was when Ghost kept overwriting my config files. </p><p><strong>Claim your site! </strong>Finally, we can navigate to your site at the domain name you specified in Cloudflare. You will want to quickly go to https://yoursite.com/ghost to create your account and claim your site. And that&apos;s all! Now you can enjoy your time publishing blogs and building your own brand. If you&apos;ve enjoyed this post and want to keep up with my content, hit that little sign up button below! Cheers &#x1F60A;</p>]]></content:encoded></item><item><title><![CDATA[My next steps after AWS]]></title><description><![CDATA[After two years working at AWS helping people with OpenSearch I am taking the next step in my career! ]]></description><link>https://tippybits.com/my-next-steps-after-aws/</link><guid isPermaLink="false">6621882417d20d00018eaea0</guid><category><![CDATA[return to office]]></category><category><![CDATA[AWS]]></category><category><![CDATA[OpenSearch Consulting]]></category><category><![CDATA[Leaving Amazon]]></category><category><![CDATA[open source OpenSearch]]></category><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Mon, 22 Apr 2024 16:01:33 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/04/StonePlants.JPG" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/04/StonePlants.JPG" alt="My next steps after AWS"><p>I spent the last 2 years working as the senior developer advocate for OpenSearch at AWS. I loved my role &#x2665;&#xFE0F; The community was incredibly welcoming and I got to work with some of the most brilliant engineers. During my time there I presented 33 times in 7 different countries. I covered everything from GenAI to deploying open source OpenSearch on Kubernetes.</p><p>So why leave? 
Amazon has a <a href="https://apnews.com/article/amazon-return-to-office-jassy-remote-work-ce9cb14169f5ee1552868d1385ce2ec8?ref=tippybits.com" rel="noreferrer">serious return to office initiative</a>. The pressure to relocate was relentless. The uncertainty about my position and the notifications I got about being &quot;non-compliant&quot; were unmanageable. I have two small kids, 1.5 and 4 years old, and most of my family and my wife&apos;s family live locally. There was just too much support for me here to be willing to relocate. I strongly hold to the belief that my family comes first. </p><p>I&apos;m not typically one to leave before I have a plan for what&apos;s next. This time was different though. My last day was Friday, March 8th, 2024. Without a plan in mind I left AWS. The first few weeks I spent interviewing with different companies, catching up with people, and praying for God to show me what He had in store for me. ~30 interviews and meetings later I still hadn&apos;t come across a role that I felt fit what I was looking for.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/Vacation.jpg" class="kg-image" alt="My next steps after AWS" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/Vacation.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/Vacation.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/Vacation.jpg 1600w, https://tippybits.com/content/images/2024/04/Vacation.jpg 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Me in front of the yurt we stayed in near Shenandoah </span></figcaption></figure><p>Naturally (or not) my next move was to take a vacation with my family. &#x1F606; Not the move most people make when they are looking for work. During this time I reflected on what it was that I liked so much about my role at AWS. 
The truth is I loved sharing my expertise with others. I am also passionate about the OpenSearch community. </p><h1 id="i%E2%80%99m-consulting">I&#x2019;m Consulting! </h1><p>After all that time I&#x2019;ve decided that what works best for me is continuing to help others with OpenSearch! I&#x2019;ll be taking what I&#x2019;ve learned over the last two years and helping others get started on their journey with OpenSearch. My experience working with the different companies and tool sets within the OpenSearch community has prepared me to help you! </p><p>If you are looking for help with OpenSearch, with anything from configuring your cluster to generative AI, <a href="https://tippybits.com/services" rel="noreferrer"><strong>I am here to help</strong></a>! I offer a free consultation where we can talk about the initiatives your team has. We&apos;ll then create a plan to get you on a roll. </p><p>If you&apos;ve enjoyed working with me or referenced some material I&apos;ve created, I need your help sharing! Starting out as a consultant is challenging and I won&apos;t be successful without the continued support of the community (that means you too!). With that, you can <a href="https://tippybits.com/#/portal/signup" rel="noreferrer">sign up</a> to get all the latest blog posts on making OpenSearch work better for you. 
Thanks to everyone who&apos;s supported me so far and I&apos;m looking forward to continuing to share this journey with you!</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/DavidTippett.JPG" class="kg-image" alt="My next steps after AWS" loading="lazy" width="1024" height="1024" srcset="https://tippybits.com/content/images/size/w600/2024/04/DavidTippett.JPG 600w, https://tippybits.com/content/images/size/w1000/2024/04/DavidTippett.JPG 1000w, https://tippybits.com/content/images/2024/04/DavidTippett.JPG 1024w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Recording media for OpenSearchCon 2023</span></figcaption></figure>]]></content:encoded></item><item><title><![CDATA[Welcome to TippyBits!]]></title><description><![CDATA[<p>After publishing my blogs for years to Medium.com I felt it was time for a change. I got started writing back in 2019 and at the time Medium was the low hassle way to publish blogs that could reach a great audience and have impact on people. I&apos;</p>]]></description><link>https://tippybits.com/coming-soon/</link><guid isPermaLink="false">661945f4226da30001736aa8</guid><category><![CDATA[News]]></category><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Fri, 12 Apr 2024 14:32:20 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/04/DSC01204.JPG" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/04/DSC01204.JPG" alt="Welcome to TippyBits!"><p>After publishing my blogs for years to Medium.com I felt it was time for a change. I got started writing back in 2019 and at the time Medium was the low hassle way to publish blogs that could reach a great audience and have impact on people. I&apos;ve come to dislike how they choose which content to distribute, so I am moving here! </p><p>So welcome to the new start for me! My name is David and I am a great many things. 
Here you will find blogs about open source software development, developer relations, search engineering, and even data infrastructure. Professionally, I spent 2 years at AWS as the developer relations lead for <a href="https://opensearch.org/?ref=tippybits.com" rel="noreferrer">OpenSearch</a>. There I was rebuilding the community that had been shattered by Elasticsearch&apos;s license change. With their return-to-office mandate, however, I decided to leave and plot my own path. </p><p>If you like the work I do you can <a href="#/portal/">subscribe</a> to stay up to date and receive emails when new content is published! Additionally, I am doing consulting in the areas of GenAI, search infrastructure (preferably related to OpenSearch), and developer relations. </p><p>There are plenty of other ways to stay involved as well! I run a <a href="https://www.youtube.com/@TippyBits?ref=tippybits.com" rel="noreferrer">YouTube channel</a>, am active on <a href="https://twitter.com/dtaivpp/?ref=tippybits.com" rel="noreferrer">Twitter</a>, <a href="https://www.linkedin.com/in/david-tippett/?ref=tippybits.com" rel="noreferrer">LinkedIn</a>, and of course <a href="https://github.com/dtaivpp/?ref=tippybits.com" rel="noreferrer">GitHub</a>. </p>]]></content:encoded></item><item><title><![CDATA[Mobile LiveStream Setup]]></title><description><![CDATA[On the go and need to stream? 
Discover how this developer advocate optimized a home kit for high-quality mobile live streaming.]]></description><link>https://tippybits.com/mobile-livestream-setup/</link><guid isPermaLink="false">6619856da8aa900001525186</guid><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Thu, 27 Jul 2023 19:33:45 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/04/1-yzzfk-tvkoek2v60qroqpw-2x-jpeg.jpg" medium="image"/><content:encoded><![CDATA[<h3 id="on-the-go-live-streaming-essential-gear-for-developer-advocates">On-the-Go Live Streaming: Essential Gear for Developer Advocates</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-gq0stz9o3wpuwhe9rauisw-2x-jpeg.jpg" class="kg-image" alt="Mobile LiveStream Setup" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-gq0stz9o3wpuwhe9rauisw-2x-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-gq0stz9o3wpuwhe9rauisw-2x-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-gq0stz9o3wpuwhe9rauisw-2x-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-gq0stz9o3wpuwhe9rauisw-2x-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>What I use to stream on the&#xA0;go.</figcaption></figure><img src="https://tippybits.com/content/images/2024/04/1-yzzfk-tvkoek2v60qroqpw-2x-jpeg.jpg" alt="Mobile LiveStream Setup"><p>As a developer advocate I am often on the road and sometimes the stars align such that I need to do a live stream while I am out. It happens often enough that I have a plan, and not often enough to buy a whole separate kit for it. 
As a result I&#x2019;ve designed my home kit around the idea that it needs to work well for streaming on the go.</p><h3 id="camera">Camera</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/0-96tvbbszhaa_y0ds.png" class="kg-image" alt="Mobile LiveStream Setup" loading="lazy" width="600" height="523" srcset="https://tippybits.com/content/images/2024/04/0-96tvbbszhaa_y0ds.png 600w"><figcaption>Credit: <a href="https://electronics.sony.com/imaging/interchangeable-lens-cameras/aps-c/p/ilczve10-b?ref=tippybits.com" data-href="https://electronics.sony.com/imaging/interchangeable-lens-cameras/aps-c/p/ilczve10-b" class="markup--anchor markup--figure-anchor" rel="noopener" target="_blank">2022 Sony Electronics Inc.</a></figcaption></figure><p>I shoot on the Sony <a href="https://electronics.sony.com/imaging/interchangeable-lens-cameras/aps-c/p/ilczve10-b?ref=tippybits.com" rel="noopener">ZV-E10</a>. It&#x2019;s a common choice amongst people getting into this space as it is a light mirrorless camera that shoots 4k and has solid autofocus. Additionally, it has a flip-out screen which makes getting the framing right super easy when you are on the go.</p><p>The funny thing is, when I am streaming I don&#x2019;t shoot in 4k, for a few reasons. First, hotel WiFi is never good enough to reliably stream at that quality. Second, when I am on a live stream I will be using picture-in-picture mode, so my head is in a small square at the bottom. 
Arguably the most practical reason is the ZV-E10 has a USB-C output it can use to stream (in 1080p), and that saves me from having to bring an HDMI capture device.</p><h3 id="lens">Lens</h3><figure class="kg-card kg-gallery-card kg-width-wide kg-card-hascaption"><div class="kg-gallery-container"><div class="kg-gallery-row"><div class="kg-gallery-image"><img src="https://tippybits.com/content/images/2024/04/1-ggr5aniqjindwgfpqjduba-jpeg.jpg" width="2000" height="1125" loading="lazy" alt="Mobile LiveStream Setup" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-ggr5aniqjindwgfpqjduba-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-ggr5aniqjindwgfpqjduba-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-ggr5aniqjindwgfpqjduba-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-ggr5aniqjindwgfpqjduba-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"></div><div class="kg-gallery-image"><img src="https://tippybits.com/content/images/2024/04/1-n22rhj0vuywggnlwactgla-jpeg.jpg" width="2000" height="1125" loading="lazy" alt="Mobile LiveStream Setup" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-n22rhj0vuywggnlwactgla-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-n22rhj0vuywggnlwactgla-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-n22rhj0vuywggnlwactgla-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-n22rhj0vuywggnlwactgla-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"></div></div></div><figcaption><strong class="markup--strong markup--figure-strong">First Image</strong>: Stock 16&#x2013;50mm lens <strong class="markup--strong markup--figure-strong">Second Image: </strong>Sony 11mm wide&#xA0;angle</figcaption></figure><p>I&#x2019;d argue that your choice of lens may actually be more important than the camera. The two photos above were taken using the default settings at 20 inches from my face. 
As you can see, the stock lens has really tight framing, meaning you need to have the camera placed far away from your face to get a decent shot.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/0-jz_mofs6icbop-ap.png" class="kg-image" alt="Mobile LiveStream Setup" loading="lazy" width="600" height="523" srcset="https://tippybits.com/content/images/2024/04/0-jz_mofs6icbop-ap.png 600w"><figcaption>Credit: <a href="https://electronics.sony.com/imaging/lenses/all-e-mount/p/sel11f18?cid=sem-na-3349&amp;utm_source=google&amp;utm_medium=cpc&amp;utm_campaign=Shopping%20ads%20%28United%20States%29%20Low%20Performing%20Sku%27s&amp;utm_term=PRODUCT_GROUP&amp;gclid=CjwKCAjwq4imBhBQEiwA9Nx1BnUGYx915QhIoV5-haULwMV2xNWJVaSaKHUMH4YupUDgES_OL4nAsBoC2vcQAvD_BwE&amp;gclsrc=aw.ds" data-href="https://electronics.sony.com/imaging/lenses/all-e-mount/p/sel11f18?cid=sem-na-3349&amp;utm_source=google&amp;utm_medium=cpc&amp;utm_campaign=Shopping%20ads%20(United%20States)%20Low%20Performing%20Sku%27s&amp;utm_term=PRODUCT_GROUP&amp;gclid=CjwKCAjwq4imBhBQEiwA9Nx1BnUGYx915QhIoV5-haULwMV2xNWJVaSaKHUMH4YupUDgES_OL4nAsBoC2vcQAvD_BwE&amp;gclsrc=aw.ds" class="markup--anchor markup--figure-anchor" rel="noopener" target="_blank">2022 Sony Electronics Inc.</a></figcaption></figure><p>The lens I recommend to remedy this is the <a href="https://electronics.sony.com/imaging/lenses/all-e-mount/p/sel11f18?ref=tippybits.com" rel="noopener">Sony 11mm F1.8</a> wide angle lens. Even though it is a fixed lens (it can&#x2019;t zoom), it does a really great job of capturing not only you but your background as well. The 1.8 aperture also makes this a great lens for low-light settings.
This is where I am probably going to have the most disagreements with people on this blog, because I went with a fairly non-standard mic.</p><p>There are lapel mics, shoe-mounted mics, handheld mics, and they all do things a bit differently. Many people recommend the <a href="https://rode.com/en-us/microphones/on-camera/videomic-go?ref=tippybits.com" rel="noopener">Rode VideoMic Go</a> as it&#x2019;s a great passive mic (no power needed). I didn&#x2019;t end up going with it, however, as I wanted something more flexible. The VideoMic Go is a directional mic, meaning it captures audio just from in front of it. That is great, but what if you wanted to capture ambient audio?</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/0-n30nfqgiw8zs1gif.png" class="kg-image" alt="Mobile LiveStream Setup" loading="lazy" width="600" height="523" srcset="https://tippybits.com/content/images/2024/04/0-n30nfqgiw8zs1gif.png 600w"><figcaption>Credit: <a href="https://electronics.sony.com/imaging/imaging-accessories/all-accessories/p/ecmb10?ref=tippybits.com" data-href="https://electronics.sony.com/imaging/imaging-accessories/all-accessories/p/ecmb10" class="markup--anchor markup--figure-anchor" rel="noopener" target="_blank">2022 Sony Electronics Inc.</a></figcaption></figure><p>I went with the <a href="https://electronics.sony.com/imaging/imaging-accessories/all-accessories/p/ecmb10?ref=tippybits.com" rel="noopener">Sony ECM-B10</a>. It&#x2019;s an active microphone that allows you to either do directional audio capture or capture ambient sounds. Additionally, it has features to auto-adjust volume, boost volume, add noise filters, and the list goes on. While it doesn&apos;t have the best shock mount to absorb noise when I am holding the camera and recording, I don&#x2019;t need that. 
That is not how I use my camera most often.</p><h3 id="tripod">Tripod</h3><figure class="kg-card kg-embed-card"><iframe src="https://www.youtube.com/embed/oMy9NgZArEM?start=143&amp;feature=oembed&amp;start=143" width="700" height="393" frameborder="0" scrolling="no"></iframe></figure><p>So Casey Neistat has a video about his bendy tripod, which is &#x201C;good&#x201D; at a lot of things but not great at everything. I&#x2019;ve taken a similar approach with my tripod. While it may not be the best at everything, it does well for what I need it for and is flexible enough to meet a majority of my needs.</p><p>It&#x2019;s the <a href="https://www.pgytech.com/products/mantispod-vlog-tripod?ref=tippybits.com" rel="noopener">Pgytech Mantis Pod Pro</a>. Honestly, I get more questions about this little tripod than any other piece of kit in my setup. This thing feels like it has 100 different setups that it can do, and the nice thing is it&#x2019;s rigid. That means after you set it up it won&#x2019;t be flopping over. 
It&#x2019;s literally the spider monkey of camera stands and they need to give me a referral link because I recommend it so much.</p><figure class="kg-card kg-image-card"><img src="https://tippybits.com/content/images/2024/04/1-lthhtroe-mc3zvsiddrzsw-2x-jpeg.jpg" class="kg-image" alt="Mobile LiveStream Setup" loading="lazy" width="1920" height="1080" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-lthhtroe-mc3zvsiddrzsw-2x-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-lthhtroe-mc3zvsiddrzsw-2x-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-lthhtroe-mc3zvsiddrzsw-2x-jpeg.jpg 1600w, https://tippybits.com/content/images/2024/04/1-lthhtroe-mc3zvsiddrzsw-2x-jpeg.jpg 1920w" sizes="(min-width: 720px) 720px"></figure><h3 id="other-bits-and-bobs"><strong>Other bits and bobs</strong></h3><p>There are a few other things I carry but don&#x2019;t really have much of an explanation for, as they just do what they are supposed to. The first is the <a href="https://www.smallrig.com/Cage-for-Sony-ZV-E10-3531B.html?ref=tippybits.com" rel="noopener">SmallRig cage</a>. This is the metal exoskeleton that is around my camera. For the most part I use it as it helps protect the camera in my backpack or in the event of a drop. It has several other 3/8 and 1/4 mounting points that can be used to mount accessories like lights or mics.</p><p>Finally, I use the <a href="https://www.aputure.com/products/mc/?ref=tippybits.com" rel="noopener">Aputure MC</a> light as it offers a lot of flexibility. You can select different brightness, color, temperature, etc. It mounts either with the two magnets on the back (which are SUPER strong) or a 1/4 inch mount. It&#x2019;s a light and it does well at lighting things &#xAF;\_(&#x30C4;)_/&#xAF;</p><h3 id="so-what-should-you-get">So what should you get?</h3><p>That comes down to how you want to use it. 
I use mine for live streaming on the go, live streaming in my home, and recording footage for work. It fits a lot of different use cases, but you may be able to get away with a much smaller setup. Wherever you can, I would focus on convenience. If it&#x2019;s too hard to set up then you won&#x2019;t do it often.</p><p>Here is the overview of my kit along with cost as of 7/27/2023. Again, I can&#x2019;t stress enough that several of these items serve a dual purpose for me. I use the camera and lens in my home setup as well, which is why I went and spent a bit extra on these.</p><ul><li><a href="https://electronics.sony.com/imaging/interchangeable-lens-cameras/aps-c/p/ilczve10l-b?ref=tippybits.com" rel="noopener">Sony ZV-E10 (with kit lens)</a>&#x200A;&#x2014;&#x200A;$799</li><li><a href="https://electronics.sony.com/imaging/lenses/all-e-mount/p/sel11f18?ref=tippybits.com" rel="noopener">Sony E 11mm F1.8</a>&#x200A;&#x2014;&#x200A;$549</li><li><a href="https://electronics.sony.com/imaging/imaging-accessories/all-accessories/p/ecmb10?ref=tippybits.com" rel="noopener">Sony ECM-B10 (Mic)</a>&#x200A;&#x2014;&#x200A;$249</li><li><a href="https://www.pgytech.com/products/mantispod-vlog-tripod?ref=tippybits.com" rel="noopener">PGYTECH Mantis Pod Pro</a>&#x200A;&#x2014;&#x200A;$149</li><li><a href="https://www.smallrig.com/Cage-for-Sony-ZV-E10-3531B.html?ref=tippybits.com" rel="noopener">SmallRig Cage</a>&#x200A;&#x2014;&#x200A;$39</li><li><a href="https://shop.aputure.com/products/mc?ref=tippybits.com" rel="noopener">Aputure MC</a>&#x200A;&#x2014;&#x200A;$90</li><li><strong>Total: $1875</strong></li></ul><p>So what do you think? Would you have done something different? Leave a comment or shoot me a <a href="https://twitter.com/dtaivpp?ref=tippybits.com" rel="noopener">tweet</a> and tell me about your setup!</p>]]></content:encoded></item><item><title><![CDATA[Developer advocacy isn&#x2019;t a free vacation]]></title><description><![CDATA[While Developer Advocates travel a lot, it is far from a free vacation. 
There is a lot of work that goes into conferences.]]></description><link>https://tippybits.com/developer-advocacy-isnt-a-free-vacation/</link><guid isPermaLink="false">6619856da8aa900001525189</guid><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Wed, 17 May 2023 14:05:53 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/04/1-_5fwornanngo50s6ujdaiq-jpeg.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/04/1-_5fwornanngo50s6ujdaiq-jpeg.jpg" alt="Developer advocacy isn&#x2019;t a free vacation"><p>It&#x2019;s hard to deny that one of the highly visible parts of developer advocacy/evangelism is the travel. You see advocates flying to Asia, Europe, Africa, and the list goes on. Who wouldn&#x2019;t want to be a part of that? The truth is, there is a lot of unseen work that goes into these conferences and events.</p><p>I don&#x2019;t want to dissuade people from getting into advocacy because, truth is, I love it. Still, 
I want to pull back the curtain on all the hard work done by developer advocates (and evangelists) surrounding conferences.</p><h3 id="pre-conference-">Pre-conference:</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/0-dbmcux6iq1nvwtvk.jpg" class="kg-image" alt="Developer advocacy isn&#x2019;t a free vacation" loading="lazy" width="2000" height="1118" srcset="https://tippybits.com/content/images/size/w600/2024/04/0-dbmcux6iq1nvwtvk.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/0-dbmcux6iq1nvwtvk.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/0-dbmcux6iq1nvwtvk.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/0-dbmcux6iq1nvwtvk.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption><a href="https://medium.com/u/4ba3d4b0fde6?ref=tippybits.com" data-href="https://medium.com/u/4ba3d4b0fde6" data-anchor-type="2" data-user-id="4ba3d4b0fde6" data-action-value="4ba3d4b0fde6" data-action="show-user-card" data-action-type="hover" class="markup--user markup--figure-user" target="_blank">Peter O&apos;Neill</a>&#x2019;s <a href="https://twitter.com/peteroneilljr/status/1653812601036627968?ref=tippybits.com" data-href="https://twitter.com/peteroneilljr/status/1653812601036627968" class="markup--anchor markup--figure-anchor" rel="noopener" target="_blank">CFP Planning&#xA0;Board</a></figcaption></figure><p>Before we can even attend conferences, we have to decide which ones! Above is <a href="https://twitter.com/peteroneilljr?ref=tippybits.com" rel="noopener">Peter&#x2019;s</a> board for planning which conferences he&#x2019;s submitting to. Here advocates have to pick which conferences best align with their company&#x2019;s goals. For example, for product feedback you may attend several smaller conferences with workshops. 
For product promotion, they may focus on larger conferences and try to land keynote spots.</p><p>2&#x2013;5 months before a conference they open up their call for proposals. This is where we have to make a pitch for a talk (that we may not have even written yet). The amount of work here varies, but typically they require 100&#x2013;250 words on the topic, a title, and then a personal bio. The challenge here is not every talk we submit gets accepted. <a href="https://twitter.com/virtualized6ix?ref=tippybits.com" rel="noopener">Marino</a> shared that his talk acceptance rate is around 28%. That&#x2019;s pretty typical from what I&#x2019;ve heard from other advocates. We may need to submit 3&#x2013;4 talks to get into the conferences we want to attend.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-o_8x0g3um6sgji5qbbhxrw.png" class="kg-image" alt="Developer advocacy isn&#x2019;t a free vacation" loading="lazy" width="1190" height="576" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-o_8x0g3um6sgji5qbbhxrw.png 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-o_8x0g3um6sgji5qbbhxrw.png 1000w, https://tippybits.com/content/images/2024/04/1-o_8x0g3um6sgji5qbbhxrw.png 1190w" sizes="(min-width: 720px) 720px"><figcaption><a href="https://twitter.com/virtualized6ix/status/1653365362761560065?ref=tippybits.com" data-href="https://twitter.com/virtualized6ix/status/1653365362761560065" class="markup--anchor markup--figure-anchor" rel="noopener" target="_blank">https://twitter.com/virtualized6ix/status/1653365362761560065</a></figcaption></figure><p>Before the conference they send acceptances or rejections. For those with accepted talks, the real work begins. Now we are on the hook to write a compelling talk. My first talk (a roughly 15-minute talk) took me nearly a month to write. Why? I needed to ensure my talk resonated with the audience. 
The words used need to be precise and accessible. For example, I wanted to describe something as simple, and <a href="https://fosstodon.org/@linux_mclinuxface?ref=tippybits.com" rel="noopener">Kyle</a>, my mentor and fellow advocate, pointed out how that could alienate people who were new to tech. Simple to us could be challenging for someone else.</p><p>Presentations often need imagery to make sense. Images have licenses that may exclude us from using them, so we may need to create our own diagrams and images. If an advocacy team is well resourced, they may have a design team they can work with. Even then the design team will need a good description of what we are looking for.</p><p>Many presentations also use code. Does the code run? Do people need to install a lot of prerequisites to get it running? How do we share it with the audience? Short links? QR Code? These are just a few of the considerations we make when writing a presentation.</p><h3 id="traveling-">Traveling:</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-emhyfhyctww1zohg75i_7q-jpeg-1.jpg" class="kg-image" alt="Developer advocacy isn&#x2019;t a free vacation" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-emhyfhyctww1zohg75i_7q-jpeg-1.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-emhyfhyctww1zohg75i_7q-jpeg-1.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-emhyfhyctww1zohg75i_7q-jpeg-1.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-emhyfhyctww1zohg75i_7q-jpeg-1.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>Rainy drive to stay at a hotel outside NYC because of a cancelled flight.</figcaption></figure><p>This is one of those things that is nice and not so nice. It&#x2019;s great to get to experience new locations, but getting to and from conferences can be a nightmare. 
Here are a few notable examples from my first year as a developer advocate:</p><ul><li>New York&#x200A;&#x2014;&#x200A;Flight home delayed 3 times and then cancelled. Stayed an extra night. I had gotten soaked in the rain and didn&#x2019;t have another set of clothes because I only planned to stay one night.</li><li>Dallas&#x200A;&#x2014;&#x200A;Flight delayed 3 times and then cancelled. Stayed an extra night and had to work out my own flight schedule. American Airlines was trying to schedule me to take 2 flights over the next two days to get home.</li><li>Prague&#x200A;&#x2014;&#x200A;Flight delayed over 3 hours, causing me to miss my connection and turning ~15 hours of travel into 26 hours. Caught Covid coming home, had to quarantine, and missed my daughter&#x2019;s 3rd birthday party. &#x1F622;</li><li>Seattle&#x200A;&#x2014;&#x200A;Almost missed my flight home because my alarm didn&#x2019;t go off, causing a lot of general panic and mayhem.</li></ul><p>A small note here: cancelled flights rarely mean more time in that city. By the time they cancel the flight, you have often already been sitting in an airport for 5+ hours. Rescheduled flights are often for early the next day, not leaving time to tour the city more.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-tci4x-8f98xwdowcj7dhra.gif" class="kg-image" alt="Developer advocacy isn&#x2019;t a free vacation" loading="lazy" width="1280" height="720" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-tci4x-8f98xwdowcj7dhra.gif 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-tci4x-8f98xwdowcj7dhra.gif 1000w, https://tippybits.com/content/images/2024/04/1-tci4x-8f98xwdowcj7dhra.gif 1280w" sizes="(min-width: 720px) 720px"><figcaption>Going to my hotel after ~25 hours traveling to&#xA0;Japan</figcaption></figure><p>Getting to and from the conference can be challenging, so here are some ways to set ourselves up for success! 
About a month before the trip, look over the information needed to travel. Where are the event and the hotel, and how do we get there? Is physical currency needed? Do we have all the documentation ready (vaccines, visas, passport)? Going through this will save a lot of headaches.</p><h3 id="at-the-conference-">At the conference:</h3><p>During the conference we are working. That could mean presenting, standing at our booth, talking about our product, or networking. Hopefully we are able to attend some of the talks; however, there is not always time for that. Often, after the conference, you will go to dinner with people you met or head to a networking event. For KubeCon, I spent almost 3 days from 8 am to 9&#x2013;10 pm talking to people.</p><blockquote>Tip: After you have talked to someone, take a note of who they were, what you talked about, and what you want to follow up with.</blockquote><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-xurzco-76xiikhtrvtxfgw-jpeg.jpg" class="kg-image" alt="Developer advocacy isn&#x2019;t a free vacation" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-xurzco-76xiikhtrvtxfgw-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-xurzco-76xiikhtrvtxfgw-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-xurzco-76xiikhtrvtxfgw-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-xurzco-76xiikhtrvtxfgw-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>Outside of the Golden Pavilion in&#xA0;Japan</figcaption></figure><blockquote>Tip: If traveling to a city you want to tour, I would recommend booking a day or two before your conference for sightseeing. 
That way, if there are delays getting to the conference, you still have a healthy buffer before missing any of it.</blockquote><p>For my Japan trip, I spent 55 hours traveling, 32 hours doing conference activities, and half of that was over the weekend. Because of all that time, I worked with my boss and was able to take some time to tour Japan outside of the work I did. Many other advocates that I&#x2019;ve talked to shared that their bosses would make allowances for similar things. Traveling to and from a conference is work, so it&#x2019;s good to recognize that and take some time off so you don&#x2019;t burn out.</p><h3 id="after-the-conference-">After the conference:</h3><p>After the conference there are three things I do. First, there is the follow-up. This is where I do all the things I agreed to during the conference. This includes sending docs, making introductions, looking into issues, etc. I go through this list several times over the next few weeks and try to action, or at least follow up on, every item. This step is exceptionally important for building trust.</p><p>The second is recapping product feedback. I talk to many of our users during conferences, and it&#x2019;s important that feedback makes its way back into the product.</p><p>Finally, we do a recap meeting. A recap meeting is where I try to summarize several of the things learned during the conference for the teams. These could be technical learnings, industry trends, or strategies we could use to improve our efficiency. Additionally, I look over the conference as a whole. Was it a good spend of company money? How can I prove that? 
Would we do something different when attending in the future?</p><h3 id="developer-advocacy-isn-t-a-free-vacation">Developer advocacy isn&#x2019;t a free vacation</h3><figure class="kg-card kg-image-card"><img src="https://tippybits.com/content/images/2024/04/1-62vih9ycmxgkws3psouicg-png.jpg" class="kg-image" alt="Developer advocacy isn&#x2019;t a free vacation" loading="lazy" width="2000" height="2000" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-62vih9ycmxgkws3psouicg-png.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-62vih9ycmxgkws3psouicg-png.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-62vih9ycmxgkws3psouicg-png.jpg 1600w, https://tippybits.com/content/images/2024/04/1-62vih9ycmxgkws3psouicg-png.jpg 2048w" sizes="(min-width: 720px) 720px"></figure><p>Advocacy is not a free vacation. I spend roughly three months a year away from my wife and two children. Working conferences means long hours, and there is a lot of stress involved in presenting. All that to say, I love working as a developer advocate. 
I&#x2019;ve met so many wonderful people (many pictured above), but it is far from a free vacation.</p><p>If you&#x2019;d like to know more about advocacy and how you can get started, check out this video I recently did about my journey into developer advocacy.</p><figure class="kg-card kg-embed-card"><iframe src="https://www.youtube.com/embed/cILjZyS2qeY?feature=oembed" width="700" height="393" frameborder="0" scrolling="no"></iframe></figure>]]></content:encoded></item><item><title><![CDATA[Using NVIDIA GPU&#x2019;s with K3&#x2019;s]]></title><description><![CDATA[Now you can attach to GPU&#x2019;s from your containers in Kubernetes to run your machine learning or transcoding workloads.]]></description><link>https://tippybits.com/using-nvidia-gpus-with-k3-s/</link><guid isPermaLink="false">6619856da8aa900001525190</guid><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Mon, 26 Sep 2022 01:04:23 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/04/1-lqtml9ebvzxnd1wgqi6ikq.png" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/04/1-lqtml9ebvzxnd1wgqi6ikq.png" alt="Using NVIDIA GPU&#x2019;s with K3&#x2019;s"><p>I wish I didn&#x2019;t have to write an article about this. It would be nice if Nvidia would (fully) open source their drivers so they could be a first-class citizen on Linux and in Kubernetes. That is not the case at the moment, however, so here is the journey you will need to take to get Nvidia GPUs working with Linux (I am specifically on Ubuntu 22.04).</p><h3 id="installing-drivers">Installing Drivers</h3><p>The first step, of course, is to install the appropriate drivers. I am going to assume you&#x2019;d like to patch the drivers. Patching the drivers removes the artificial 2 transcode limit currently imposed on them. We will be building the drivers because why not.</p><p>Before we start, we need to ensure the nouveau driver is disabled. 
We are doing this because nouveau is not compatible with the Nvidia drivers and the K3s modules we will be using later. If you would like expanded details on this step, you can see <a href="https://linuxconfig.org/how-to-disable-blacklist-nouveau-nvidia-driver-on-ubuntu-22-04-jammy-jellyfish-linux?ref=tippybits.com" rel="noopener">this linuxconfig.org</a> article, where these instructions were pulled from.</p><pre><code># Add Nouveau to the modprobe blacklist
sudo bash -c &quot;echo blacklist nouveau &gt; /etc/modprobe.d/blacklist-nvidia-nouveau.conf&quot;

# If the module is in the kernel, disable it
sudo bash -c &quot;echo options nouveau modeset=0 &gt;&gt; /etc/modprobe.d/blacklist-nvidia-nouveau.conf&quot;

# Persist these settings on each boot
sudo update-initramfs -u</code></pre><p>Now we can start building and applying the drivers. To make the process more straightforward, let&#x2019;s install dkms. This will help with collecting the required dependencies for building the drivers.</p><pre><code>sudo apt install dkms</code></pre><p>Next, we will need to download and install the appropriate drivers. 
For a nicely updated list that shows the drivers and which patches are supported, check out <a href="https://github.com/keylase/nvidia-patch?ref=tippybits.com" rel="nofollow noopener">keylase/nvidia-patch</a> (which is where the code below was pulled from).</p><pre><code># Create a driver directory
sudo mkdir /opt/nvidia &amp;&amp; cd /opt/nvidia

# Download the files (this example uses driver 515.57)
sudo wget https://international.download.nvidia.com/XFree86/Linux-x86_64/515.57/NVIDIA-Linux-x86_64-515.57.run

# Make the build script executable
sudo chmod +x ./NVIDIA-Linux-x86_64-515.57.run

# Build the driver
sudo ./NVIDIA-Linux-x86_64-515.57.run</code></pre><p>After the driver is installed, we should be able to check it with <code>nvidia-smi</code>, which should yield an output like the following one:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-9g4kcssjfz_mnv4ju1txha.png" class="kg-image" alt="Using NVIDIA GPU&#x2019;s with K3&#x2019;s" loading="lazy" width="800" height="383" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-9g4kcssjfz_mnv4ju1txha.png 600w, https://tippybits.com/content/images/2024/04/1-9g4kcssjfz_mnv4ju1txha.png 800w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Source: Me</span></figcaption></figure><h3 id="patching-drivers">Patching Drivers</h3><p>Now we can patch the drivers so that the transcode limit is unlocked. To do this you need to have Git on your server (<code>sudo apt install git</code>). 
If you run into issues with the following steps, you should check out the source repo: <a href="https://github.com/keylase/nvidia-patch?ref=tippybits.com" rel="noopener">keylase/nvidia-patch</a>.</p><pre><code># Optional if you want to start in your home directory
cd ~

# Clone down the patch scripts
git clone https://github.com/keylase/nvidia-patch.git

# Navigate into the repo folder with the patches
cd nvidia-patch

# Ensure the patch is executable
sudo chmod +x patch.sh

# Execute the patch (needs to be done in a bash shell)
sudo bash ./patch.sh</code></pre><h3 id="installing-nvidia-container-runtime">Installing Nvidia Container Runtime</h3><p>Nvidia&#x2019;s container runtime is now included in their container toolkit project. To install it, we first have to add the signing key to our system&#x2019;s package manager (apt).</p><pre><code># Adding the signing key to apt 
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \ 
  sudo apt-key add -</code></pre><pre><code># Create a variable with our distribution string 
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)</code></pre><pre><code># Install the appropriate package list for our distribution 
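As a quick sanity check, the distribution variable above is simply the ID and VERSION_ID fields sourced from /etc/os-release and concatenated; echoing it shows which package list the next command will fetch:

```shell
# Build the distribution string the same way as the step above
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)

# Show what it resolved to, e.g. "ubuntu22.04" on Ubuntu 22.04
echo "$distribution"
```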
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \ 
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list</code></pre><p>Now that it has been added to our apt sources list, we can run the following commands to install it.</p><pre><code>sudo apt-get update \
    &amp;&amp; sudo apt-get install -y nvidia-container-toolkit</code></pre><p>After the install completes, you have successfully installed the container runtime! We are almost at the finish line! Up to this point, we have installed our Nvidia drivers and installed the special container runtime that will allow us to use GPUs with containers. Now we have to configure K3s to use this container runtime with the containerd CRI (container runtime interface).</p><h3 id="configuring-containerd-to-use-nvidia-container-runtime">Configuring containerd to use Nvidia-Container-Runtime</h3><p>Finally, the last step(s) to get our GPUs working with Kubernetes. Here we are going to make some modifications to containerd, the CRI, so that it will use our new nvidia-container-runtime. To view the original documentation, check out Nvidia&#x2019;s guide <a href="https://github.com/NVIDIA/k8s-device-plugin?ref=tippybits.com" rel="noopener">here</a>.</p><p>We will start by opening up <code>/etc/containerd/config.toml</code> in your text editor of choice (e.g. vi, vim, nano). With it open, we will add the following block:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/dtaivpp/0d42b1c185a6e4a9677b2ffcce9d071e.js"></script>
<!--kg-card-end: html-->
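For reference, the block in the gist above follows the pattern from Nvidia's k8s-device-plugin documentation: it registers nvidia-container-runtime as a runtime under the CRI plugin. A rough sketch of the shape is below; key names vary between containerd versions, so treat the gist as the source of truth.

```toml
# Sketch of the runtime block added to /etc/containerd/config.toml
# (assumed from Nvidia's docs; exact keys depend on your containerd version)
[plugins."io.containerd.grpc.v1.cri".containerd]
  default_runtime_name = "nvidia"

  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
    runtime_type = "io.containerd.runc.v2"

    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
      BinaryName = "/usr/bin/nvidia-container-runtime"
```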
<p>Additionally, if there is a line that contains <code>disabled_plugins = [&quot;cri&quot;]</code>, we will need to comment that out by putting a # in front of it. Now we can restart the containerd service and it is off to the races!</p><pre><code>sudo systemctl restart containerd</code></pre><p>With that, you should now be able to schedule pods that request GPU resources! Make sure to follow along for more guides, and if you like this type of content check out my YouTube!</p>
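One way to smoke test the whole chain, assuming the Nvidia device plugin from the guide linked above has been deployed (the pod name and CUDA image tag here are arbitrary examples, not from the original article):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test            # arbitrary example name
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda
    image: nvidia/cuda:11.7.1-base-ubuntu22.04
    command: ["nvidia-smi"]       # should print the same table we saw on the host
    resources:
      limits:
        nvidia.com/gpu: 1         # resource advertised by the Nvidia device plugin
```

If `kubectl logs gpu-smoke-test` shows the familiar nvidia-smi table, the driver, container runtime, and device plugin are all wired together correctly.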
<!--kg-card-begin: html-->
<figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.youtube.com/channel/UC3brzz477MWmsEBefKQukcA?ref=tippybits.com"><div class="kg-bookmark-content"><div class="kg-bookmark-title">David Tippett</div><div class="kg-bookmark-description">Share your videos with friends, family, and the world</div><div class="kg-bookmark-metadata"></div></div><div class="kg-bookmark-thumbnail"><img src="https://tippybits.com/content/images/2024/04/0-cv9tslccsy4ro2yb.jpg" alt="Using NVIDIA GPU&#x2019;s with K3&#x2019;s"></div></a></figure>
<!--kg-card-end: html-->
]]></content:encoded></item><item><title><![CDATA[ZimaBoard the Next-Gen Home Server]]></title><description><![CDATA[An overview of the next generation of home server boards: the ZimaBoard by IceWhale Technology]]></description><link>https://tippybits.com/zimaboard-the-next-gen-home-server/</link><guid isPermaLink="false">6619856da8aa900001525198</guid><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Wed, 16 Feb 2022 17:20:47 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/04/1-bncvos3ici3j-r8fqnsm4q-jpeg.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/04/1-bncvos3ici3j-r8fqnsm4q-jpeg.jpg" alt="ZimaBoard the Next-Gen Home Server"><p>Super excited to finally be able to share this with you all! Thirteen months ago, the IceWhale team launched the <a href="https://www.zimaboard.com/?ref=tippybits.com" rel="noopener">ZimaBoard</a> SBC (single board computer) on Kickstarter. The project was an instant hit and was completely funded within <a href="https://www.kickstarter.com/projects/icewhaletech/zimaboard-single-board-server-for-creators/posts/3072976?ref=tippybits.com" rel="noopener">10 minutes</a> of their launch! Their success shows that the world is ready for the next generation of SBCs tailored for home server use.</p><p>They then had the challenge of producing and delivering the ZimaBoard despite the massive chip shortages. The time has finally come! 
They have just shipped the first <a href="https://www.kickstarter.com/projects/icewhaletech/zimaboard-single-board-server-for-creators/posts/3418577?ref=tippybits.com" rel="noopener">250 boards</a> and the engineering samples, one of which I will be sharing here.</p><h3 id="unboxing">Unboxing</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-k4okxbttgoph3rk8nrw4hg-jpeg.jpg" class="kg-image" alt="ZimaBoard the Next-Gen Home Server" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-k4okxbttgoph3rk8nrw4hg-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-k4okxbttgoph3rk8nrw4hg-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-k4okxbttgoph3rk8nrw4hg-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-k4okxbttgoph3rk8nrw4hg-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>ZimaBoard Engineering Unboxing</figcaption></figure><p>When I received this in the mail, I was super surprised. To be clear, this is not my pledge. This is an engineering sample for showing the world the ZimaBoard and demonstrating how to use it. The case shows a clear attention to detail, which you will notice in every aspect of the project. 
IceWhale made several changes to the product along the way as a result of feedback they received through Kickstarter.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-dyrcmlsurv7-p2xehhdjha-jpeg.jpg" class="kg-image" alt="ZimaBoard the Next-Gen Home Server" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-dyrcmlsurv7-p2xehhdjha-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-dyrcmlsurv7-p2xehhdjha-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-dyrcmlsurv7-p2xehhdjha-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-dyrcmlsurv7-p2xehhdjha-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>ZimaBoard Overview</figcaption></figure><p>The ZimaBoard fills a large gap in the single board computer market. It provides a solid foundation for people to build home servers on without breaking the bank. Many people have been using Raspberry Pis for this; however, they have limited or no support for many of the devices home server users need.</p><h3 id="front-io">Front IO</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-xu5detaff3lp4yk8yng9qa-jpeg.jpg" class="kg-image" alt="ZimaBoard the Next-Gen Home Server" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-xu5detaff3lp4yk8yng9qa-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-xu5detaff3lp4yk8yng9qa-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-xu5detaff3lp4yk8yng9qa-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-xu5detaff3lp4yk8yng9qa-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>Front IO</figcaption></figure><p>On the front (from left to right) are the following connectors. 
Mini DisplayPort 1.2 serves as the display output for the board. This output supports 4K at 60Hz. You will need to get an adapter if you don&#x2019;t already have a Mini DisplayPort cable, but they are fairly common.</p><p>Next, on the board are two Gigabit Ethernet ports that are powered by a Realtek NIC. They were trying to get Intel NICs but weren&#x2019;t able to secure the supply in time for the release of the board. Having two Ethernet ports allows you to use the ZimaBoard as a router, firewall, or other network appliance.</p><p>Underneath the Ethernet ports are two USB 3.0 connectors. Nothing special to talk about there. Finally, on the very right there is a barrel jack for power delivery. The power supply that ships with the Kickstarter kit is 12 volts/3 amps. You may be able to get by with a smaller power supply if you are not planning to use PCIe cards or SATA drives.</p><h3 id="pcie">PCIe</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-rovuudxb9x1ck5kndhemvw-jpeg.jpg" class="kg-image" alt="ZimaBoard the Next-Gen Home Server" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-rovuudxb9x1ck5kndhemvw-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-rovuudxb9x1ck5kndhemvw-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-rovuudxb9x1ck5kndhemvw-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-rovuudxb9x1ck5kndhemvw-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>PCIe Slot</figcaption></figure><p>Speaking of PCIe cards, here is a good shot of the port. It&#x2019;s a PCIe 2.0 x4 slot. This means that it supports up to 2GB/s of data transfer. One great thing about the PCIe slot is that the end is open, so it can support full-length cards. 
Just note that they will operate at reduced bandwidth.</p><p>One other thing that is just barely pictured here (right above the PCIe slot) is a cutout with support for standard motherboard headers. This includes drive status lights, power on/off, reset, board status lights, 5V lines, etc.</p><figure class="kg-card kg-image-card"><img src="https://tippybits.com/content/images/2024/04/1-k7k1ms-yuqcav2vxzm5o1q-jpeg.jpg" class="kg-image" alt="ZimaBoard the Next-Gen Home Server" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-k7k1ms-yuqcav2vxzm5o1q-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-k7k1ms-yuqcav2vxzm5o1q-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-k7k1ms-yuqcav2vxzm5o1q-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-k7k1ms-yuqcav2vxzm5o1q-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"></figure><p>Finally, onto the rear IO. On the back are 2 SATA 6 Gbps ports with a power connector between them. The connector on the back supplies enough power for most 2.5&quot; drives. They do recommend using an external power supply if you want to use 3.5&quot; drives.</p><p>Now onto what can&#x2019;t be seen! There are 3 different models of the ZimaBoard, all with slightly different specs. The N3350 has 2 cores and the N3450 has 4 cores. Both of them support VT-x and VT-d, which is great news for anyone trying to virtualize on the board. Along with that, they support AES-NI as well as hardware-accelerated 4K transcoding and H.265/H.264 encoding and decoding. 
For more complete specs, check the matrix below and the <a href="https://docs.zimaboard.com/zimaboard/intro/?ref=tippybits.com" rel="noopener">docs site</a>.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/0-5xakmxbzepea0jug.png" class="kg-image" alt="ZimaBoard the Next-Gen Home Server" loading="lazy" width="1732" height="2209" srcset="https://tippybits.com/content/images/size/w600/2024/04/0-5xakmxbzepea0jug.png 600w, https://tippybits.com/content/images/size/w1000/2024/04/0-5xakmxbzepea0jug.png 1000w, https://tippybits.com/content/images/size/w1600/2024/04/0-5xakmxbzepea0jug.png 1600w, https://tippybits.com/content/images/2024/04/0-5xakmxbzepea0jug.png 1732w" sizes="(min-width: 720px) 720px"><figcaption><a href="https://docs.zimaboard.com/zimaboard/intro?ref=tippybits.com" data-href="https://docs.zimaboard.com/zimaboard/intro" class="markup--anchor markup--figure-anchor" rel="noopener" target="_blank">ZimaBoard Specifications</a></figcaption></figure><p>One of my favorite things about this board is that it isn&#x2019;t ARM. 
It may sound silly, but it is so frustrating to be blocked from installing some software because the CPU architecture isn&#x2019;t supported.</p><h3 id="the-extra-bits-and-bobs">The Extra Bits and Bobs</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-08rgl_cawmpmbjpklelcig-jpeg.jpg" class="kg-image" alt="ZimaBoard the Next-Gen Home Server" loading="lazy" width="2000" height="1304" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-08rgl_cawmpmbjpklelcig-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-08rgl_cawmpmbjpklelcig-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-08rgl_cawmpmbjpklelcig-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-08rgl_cawmpmbjpklelcig-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>The other bits and bobs from the&#xA0;case</figcaption></figure><p>Now for the extra bits and bobs that came with the board, starting from least interesting to most interesting. They included a Mini DisplayPort to HDMI cable, an Ethernet cable, and a power supply (with several international plug adapters). That is also a good note for anyone buying a board off of the website. 
You will need to buy a power supply as it isn&#x2019;t included at the moment.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-fwfe6i_eokv6rffryy8e1g-jpeg.jpg" class="kg-image" alt="ZimaBoard the Next-Gen Home Server" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-fwfe6i_eokv6rffryy8e1g-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-fwfe6i_eokv6rffryy8e1g-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-fwfe6i_eokv6rffryy8e1g-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-fwfe6i_eokv6rffryy8e1g-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>SSD Setup</figcaption></figure><p>Next, from the magic box there is a Kioxia 480 GB SSD and a connector for the board. A good point here: if you want to use the onboard power adapter, you need to get the <a href="https://shop.zimaboard.com/collections/diy-nas-media-server-bt-client/products/sata-y-cable-for-zimaboard-2-5-inch-hdd-3-5-inch-hdd-raid-free-nas-unraid?ref=tippybits.com" rel="noopener">correct cable</a> from their site.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-yifw8elaaox5n5txw2tslq-jpeg.jpg" class="kg-image" alt="ZimaBoard the Next-Gen Home Server" loading="lazy" width="2000" height="1500" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-yifw8elaaox5n5txw2tslq-jpeg.jpg 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-yifw8elaaox5n5txw2tslq-jpeg.jpg 1000w, https://tippybits.com/content/images/size/w1600/2024/04/1-yifw8elaaox5n5txw2tslq-jpeg.jpg 1600w, https://tippybits.com/content/images/size/w2400/2024/04/1-yifw8elaaox5n5txw2tslq-jpeg.jpg 2400w" sizes="(min-width: 720px) 720px"><figcaption>M.2 NVMe/NGFF Breakout&#xA0;Board</figcaption></figure><p>Finally, there is the M.2 adapter board. 
This board is particularly interesting because it has an &#x2018;M&#x2019; keyed NVMe slot that is connected to the board over the PCIe connector. With this, most NVMe drives will likely be bottlenecked: the maximum that NVMe could theoretically support is 32GB/s, but PCIe 2.0 with 4 lanes can at max do 2GB/s. The other interesting bit about this is that the second slot supports &#x2018;B&#x2019; keyed NGFF drives. The data here is transferred to the board over the SATA interface, which is limited to 750 MB/s (6 Gbps).</p><p>That pretty much wraps up my experience with the ZimaBoard so far. While this is far from an exhaustive overview, there will be much more content to come. If you have any questions, please check out their <a href="https://discord.gg/Gx4BCEtHjx?ref=tippybits.com" rel="noopener">Discord</a> or reach out on <a href="https://twitter.com/zimaboard?ref=tippybits.com" rel="noopener">Twitter</a>. Also, if you would like to watch more content about the ZimaBoard, check out my YouTube video!</p><figure class="kg-card kg-embed-card"><iframe src="https://www.youtube.com/embed/6ghbmpAGuAM?feature=oembed" width="700" height="393" frameborder="0" scrolling="no"></iframe></figure>]]></content:encoded></item><item><title><![CDATA[Stop Letting Crappy Code Into Your Repos]]></title><description><![CDATA[Utilizing git hooks and pre-commit to improve your repo quality]]></description><link>https://tippybits.com/stop-letting-crappy-code-into-your-repos/</link><guid isPermaLink="false">6619856da8aa90000152519b</guid><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Wed, 01 Sep 2021 14:54:25 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/04/0-famrvyaxwquz9_5a.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/04/0-famrvyaxwquz9_5a.jpg" alt="Stop Letting Crappy Code Into Your Repos"><p>I&#x2019;ve seen this happen in all manner of companies. 
From Fortune 500 companies to small startups, everyone seems to be missing the first step needed to ensure your codebase doesn&#x2019;t get tainted by bad and broken commits. That step is git commit hooks.</p><hr><h3 id="what-are-git-hooks">What are Git Hooks?</h3><p>Git hooks allow you to trigger custom code on different git events. A common use case for this would be running your linter before you allow a commit to pass. You can view some sample hooks in your repo by going to <code>.git/hooks/</code> and looking at all the ones that end in &#x201C;.sample&#x201D;.</p><p>There are two categories of hooks: local and server-side. We will just be covering the local hooks as that is where I feel the most unrealized value is.</p><p>Below I have a screenshot of what the <code>pre-commit.sample</code> looks like. If you are like me (not a fan of shell scripts) then the code below is pretty scary looking&#x2026; So we are not going to use it!</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-apnn2cv2x87gc-nnezej5q.png" class="kg-image" alt="Stop Letting Crappy Code Into Your Repos" loading="lazy" width="1096" height="1416" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-apnn2cv2x87gc-nnezej5q.png 600w, https://tippybits.com/content/images/size/w1000/2024/04/1-apnn2cv2x87gc-nnezej5q.png 1000w, https://tippybits.com/content/images/2024/04/1-apnn2cv2x87gc-nnezej5q.png 1096w" sizes="(min-width: 720px) 720px"><figcaption>.git/hooks/pre-commit.sample</figcaption></figure><p>There are 3 reasons we aren&#x2019;t going to use the built-in shell scripting for our hooks. First, they aren&#x2019;t able to be checked into version control. There is a workaround to allow git hooks to be checked in, but that involves modifying your global <code>core.hooksPath</code>, which isn&#x2019;t sustainable. 
Most repos won&#x2019;t agree on where the hooks should be, so it&#x2019;s better to not have to constantly modify your hooks&apos; path.</p><p>The second reason is that shell scripting is not something most people want to learn; you already have enough complicated languages to know.</p><p>Finally, and this is the big one: you can&#x2019;t share or distribute hooks. You would literally need to copy your hooks from one project to another. That is a lot of code duplication. That would make updating hooks for all your repositories a nightmare. This is where pre-commit comes in to save the day.</p><h3 id="pre-commit-a-sustainable-solution">Pre-Commit&#x200A;&#x2014;&#x200A;A sustainable solution</h3><p><a href="https://pre-commit.com/?ref=tippybits.com" rel="noopener">Pre-Commit</a> is a library for allowing us to utilize and distribute git hooks. While it is written in Python, it is able to call any executable program, which means you can configure all of your hooks from one place.</p><p>Using pre-commit is as simple as this:</p><pre><code>python -m pip install pre-commit
pre-commit install</code></pre><p>Then you simply need to create a <code>.pre-commit-config.yaml</code> file in the root of your repository. This config file tells pre-commit where it can find the hooks it needs to run. 
Below is a sample of what this looks like.</p><pre><code>repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.3.0
    hooks:
    -   id: check-yaml
    -   id: end-of-file-fixer
    -   id: trailing-whitespace

-   repo: https://github.com/Yelp/detect-secrets
    rev: v1.1.0
    hooks:
    -   id: detect-secrets
        args: [&quot;scan&quot;]
</code></pre><pre><code>- repo: https://github.com/dtaivpp/commit-msg-regex-hook 
  rev: v0.1.0 
  hooks: 
    - id: commit-msg-hook 
      args: [&quot;--pattern=&apos;[A-Z]{3,4}-[0-9]{3,6} \\| [\\w\\s]*&apos;&quot;] 
      stages: [commit-msg]</code></pre><p>Here is how it works: on the first run, pre-commit downloads the hooks repos into their own virtual environments with the tagged version. Then it will run them against the files that have changed and are being checked in. If you update the &#x2018;rev&#x2019; to bump the version, on the next run it will download the new version for checking your code.</p><p>The first repo in the config will validate YAML to ensure it is formatted correctly. It will then fix the end of the file and remove any unnecessary trailing whitespace. These are just code cleanliness steps.</p><p>Next, Yelp&#x2019;s detect-secrets will run and check that you are not checking any keys or secrets into your repo. It is familiar with many different token types and can even detect passwords by looking for high entropy strings. It is really important to find secrets before they are ever entered into a commit, as once they are committed it is difficult to scrub your git history.</p><p>Finally, <code><a href="https://github.com/dtaivpp/commit-msg-regex-hook?ref=tippybits.com" rel="noopener">commit-msg-regex-hook</a></code> will check and verify that the commit message matches the regex pattern provided. I&#x2019;m especially proud of this step, as I crafted this plugin.</p><p>Note: to be able to run this one, you need to run <code>pre-commit install --hook-type commit-msg</code>. This hook is a commit message hook, and it gets installed in a different hook file from the regular pre-commit hooks.</p><h3 id="pre-commit-advanced-usage">Pre-Commit&#x200A;&#x2014;&#x200A;Advanced Usage</h3><p>Any number of plugins can be added as well. If it has an executable, pre-commit is able to use it.</p><h4 id="linting-hooks">Linting Hooks</h4><p>A common use case for pre-commit is automatically linting your files. For Python, you could use the below to automatically run <code>pylint</code> and have it fail your commit if your linter scores your code under 8. 
It will give the output of the linting and tell you which areas need to be improved to score higher.</p><pre><code>  - repo: https://github.com/pycqa/pylint
    rev: pylint-2.6.0
    hooks:
    - id: pylint
      args: [&quot;--fail-under=8&quot;]</code></pre><h4 id="type-checking-config-validation">Type Checking/Config Validation</h4><p>If you are using a tool that relies heavily on config files, like Kubernetes, CircleCI, or Ansible, you have probably experienced the frustration of checking in your code only to have it fail because you mistyped a variable name or your indentation was off.</p><p>Pre-Commit can help here. Many of these tools provide utilities for checking these file types. For example, with CircleCI you can use the command <code>circleci config validate</code>. This can be added as a pre-commit hook so that before every commit you know that your config files are valid. Below is a pre-commit hook that can help validate CircleCI files.
Credits to <a href="https://github.com/KoBoldMetals/pre-commit-circleci?ref=tippybits.com" rel="noopener">KoBoldMetals</a> for this commit hook.</p><pre><code>  - repo: https://github.com/KoBoldMetals/pre-commit-circleci
    rev: v0.0.4
    hooks:
    - id: circleci-validate</code></pre><p>With that, you are now ready to elevate your developer experience. Your company will thank you, as your repo will now have better security, fewer failed commits, and overall better-quality code.</p>]]></content:encoded></item><item><title><![CDATA[The Difference Between Elasticsearch, Open Distro, and OpenSearch]]></title><description><![CDATA[Here is exactly what OpenSearch is and how it relates to Elasticsearch and Open Distro.]]></description><link>https://tippybits.com/the-difference-between-elasticsearch-open-distro-and-opensearch/</link><guid isPermaLink="false">6619856da8aa90000152518e</guid><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Wed, 16 Jun 2021 11:15:05 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/04/1-hrke3npzzsjcrkplfjyshg.png" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/04/1-hrke3npzzsjcrkplfjyshg.png" alt="The Difference Between Elasticsearch, Open Distro, and OpenSearch"><p>** Note: Open Distro is no longer releasing new versions. All development has moved to OpenSearch** <br>That&#x2019;s right, Amazon is releasing an open-source fork of Elasticsearch/Kibana. This may be a bit confusing as they have already been supporting Open Distro since February of 2019.
We&#x2019;ll take a side-by-side look at both to understand what is so different about OpenSearch.</p><h3 id="open-distro">Open Distro</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-irpkajfbc2cdhbzd2_yyca.png" class="kg-image" alt="The Difference Between Elasticsearch, Open Distro, and OpenSearch" loading="lazy" width="773" height="306" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-irpkajfbc2cdhbzd2_yyca.png 600w, https://tippybits.com/content/images/2024/04/1-irpkajfbc2cdhbzd2_yyca.png 773w" sizes="(min-width: 720px) 720px"><figcaption>Source: Me</figcaption></figure><p>We will start with <a href="https://opendistro.github.io/?ref=tippybits.com" rel="noopener">Open Distro</a> since it is the older of the two. Open Distro is still vanilla Elasticsearch at its core. What Amazon did with Open Distro was add functionality to both Elasticsearch and Kibana. The value they added came in the following additions:</p><ul><li><a href="https://opendistro.github.io/for-elasticsearch/features/security.html?ref=tippybits.com" rel="noopener"><strong>Enhanced Security</strong></a><br>They added several authentication methods such as integrating with SAML, Kerberos, LDAP/AD, and Proxy Auth/SSO.</li><li><a href="https://opendistro.github.io/for-elasticsearch/features/alerting.html?ref=tippybits.com" rel="noopener"><strong>Alerting</strong></a><br>Alerts take a query as a parameter and can then alert based on a threshold or the output of a custom script. The alert can be sent over SMS, email, or any other way you can imagine.</li><li><a href="https://opendistro.github.io/for-elasticsearch/features/knn.html?ref=tippybits.com" rel="noopener"><strong>K-NN (Nearest Neighbor</strong></a><strong>)</strong><br>K nearest neighbor allows you to quickly perform k-NN calculations on billions of documents.
This is a common similarity-search algorithm that has been optimized for speed.</li><li><a href="https://opendistro.github.io/for-elasticsearch/features/indexmanagement.html?ref=tippybits.com" rel="noopener"><strong>Index Management</strong></a><br>This Kibana plugin helps you define index management policies based on the number of documents, index size, or age. This allows you to manage things like TTL, backups, etc. within Kibana.</li><li><a href="https://opendistro.github.io/for-elasticsearch/features/analyzer.html?ref=tippybits.com" rel="noopener"><strong>Performance Management</strong></a><br>When troubleshooting your application, you need something that works even when your cluster doesn&#x2019;t. The performance manager does just that by working as an outside agent to give us the 411 on what our environment is doing.</li><li><a href="https://opendistro.github.io/for-elasticsearch/features/SQL%20Support.html?ref=tippybits.com" rel="noopener"><strong>and of course a SQL interface&#x2026;.</strong></a><br>Do we really have to keep doing this? Why does every platform need to have a SQL interface? I digress; it&#x2019;s in there for better or worse.</li></ul><p>Aside from this, they have done well to ensure they could easily keep Open Distro up to date with the upstream Elasticsearch repos. Or at least that was the intent. Recently, Elastic has added additional checks in an effort to slow or stop users from being able to use Open Distro altogether.
In the words of <a href="https://medium.com/u/7af59b7e05f8?ref=tippybits.com">Kyle</a>, the developer advocate for OpenSearch, Open Distro was the open-source community&#x2019;s response to X-Pack.</p><h3 id="opensearch">OpenSearch</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1-_u3tyq5-_1izjeim_umipg.png" class="kg-image" alt="The Difference Between Elasticsearch, Open Distro, and OpenSearch" loading="lazy" width="803" height="216" srcset="https://tippybits.com/content/images/size/w600/2024/04/1-_u3tyq5-_1izjeim_umipg.png 600w, https://tippybits.com/content/images/2024/04/1-_u3tyq5-_1izjeim_umipg.png 803w" sizes="(min-width: 720px) 720px"><figcaption>Source: Me</figcaption></figure><p><a href="https://opensearch.org/?ref=tippybits.com" rel="noopener">OpenSearch</a> is a fork of Elasticsearch, picking up where open-source Elasticsearch left off. The team working on OpenSearch has forked version 7.10 of Elasticsearch and is in the process of gutting it.
As you can see below, it&#x2019;s been a bloody war.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/0-j4ip8_jxbbud0xcg.png" class="kg-image" alt="The Difference Between Elasticsearch, Open Distro, and OpenSearch" loading="lazy" width="1131" height="864" srcset="https://tippybits.com/content/images/size/w600/2024/04/0-j4ip8_jxbbud0xcg.png 600w, https://tippybits.com/content/images/size/w1000/2024/04/0-j4ip8_jxbbud0xcg.png 1000w, https://tippybits.com/content/images/2024/04/0-j4ip8_jxbbud0xcg.png 1131w" sizes="(min-width: 720px) 720px"><figcaption>Source: <a href="https://discuss.opendistrocommunity.dev/t/preparing-opensearch-and-opensearch-dashboards-for-release/5567?ref=tippybits.com" data-href="https://discuss.opendistrocommunity.dev/t/preparing-opensearch-and-opensearch-dashboards-for-release/5567" class="markup--anchor markup--figure-anchor" rel="noopener" target="_blank">Preparing OpenSearch and OpenSearch Dashboards for&#xA0;Release</a></figcaption></figure><p>Gutting it means a few different things. The first and most obvious is the name. Everywhere in the code where there is an Elasticsearch or Kibana reference, it is changed to OpenSearch. Although it may sound simple, weeks of work went into making all the name changes so they are consistent across the board.</p><p>The next and arguably most complicated change they made is removing many of the Elasticsearch-specific features such as X-Pack, license checks, and Elastic &#x201C;phone home&#x201D; code.</p><p>X-Pack arguably caused the most controversy. These were the &#x201C;open source&#x201D; but &#x201C;Elastic-licensed&#x201D; modules. What this meant is that they could be used by end users, but anyone who wanted to sell services with Elasticsearch needed to purchase licensing from Elastic.
That didn&#x2019;t sit well with Amazon, as they had contributed to Elastic and were selling hosted Elasticsearch services.</p><p>Because all X-Pack modules were removed from Elasticsearch, X-Pack-enabled Beats will no longer work either. Want to monitor Netflows, F5, CoreDNS, or many other common log formats? You are straight outta luck. Their license check won&apos;t allow their Beats to work with any non-X-Pack licensed Elasticsearch. There are other log stream processors you can use, such as <a href="https://github.com/fluent/fluentd?ref=tippybits.com" rel="noopener">fluentd</a>.</p><!--kg-card-begin: html--><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/fluent/fluentd?ref=tippybits.com"><div class="kg-bookmark-content"><div class="kg-bookmark-title">fluent/fluentd</div><div class="kg-bookmark-description">GitHub Actions: Drone CI for Arm64: Fluentd collects events from various data sources and writes them to files, RDBMS&#x2026;</div><div class="kg-bookmark-metadata"></div></div><div class="kg-bookmark-thumbnail"><img src="https://tippybits.com/content/images/2024/04/0-y2giuqyh6bwxnnkh.png" alt="The Difference Between Elasticsearch, Open Distro, and OpenSearch"></div></a></figure><!--kg-card-end: html--><p>The phone-home code I mentioned was used by Elastic to get utilization metrics from end users. While this sounds sketchy, there is a way to disable this service. Elastic uses this information to drive product decisions. Say, for example, many users needed to do range scans. Elastic could use this information to optimize those types of operations to improve the product for everyone&apos;s use case.</p><p>Finally, they are adding all of the wonderful additions they made to Open Distro to OpenSearch.</p><p>It&#x2019;s safe to say that while OpenSearch is very similar to Elasticsearch now, they are staring down very different paths.
OpenSearch is committed to keeping its fork open source and has the backing of Amazon to do so. That&#x2019;s why I believe that everyone will start to make their move over to OpenSearch.</p><p>Big thanks to <a href="https://medium.com/u/7af59b7e05f8?ref=tippybits.com">Kyle</a> for help with some of the technical details. Check him out on <a href="https://twitter.com/stockholmux?ref=tippybits.com" rel="noopener">Twitter</a> or <a href="https://github.com/stockholmux?ref=tippybits.com" rel="noopener">GitHub</a>.</p><p><em>More content at </em><a href="http://plainenglish.io/?ref=tippybits.com" rel="noopener noreferrer"><em>plainenglish.io</em></a></p>]]></content:encoded></item><item><title><![CDATA[Fixing your SSL Verify Errors in Python]]></title><description><![CDATA[I don’t think I could properly put into words just how much I dislike SSL. Not because I think it’s bad, but because it takes so much time…]]></description><link>https://tippybits.com/fixing-your-ssl-verify-errors-in-python/</link><guid isPermaLink="false">6619856da8aa90000152519a</guid><dc:creator><![CDATA[David Tippett]]></dc:creator><pubDate>Thu, 06 May 2021 14:44:24 GMT</pubDate><media:content url="https://tippybits.com/content/images/2024/04/1-5j6ulfbavglf8pbm4b__qw-jpeg.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://tippybits.com/content/images/2024/04/1-5j6ulfbavglf8pbm4b__qw-jpeg.jpg" alt="Fixing your SSL Verify Errors in Python"><p>I don&#x2019;t think I could properly put into words just how much I dislike SSL. Not because I think it&#x2019;s bad, but because it takes so much time to figure out what is going on. I mean, what the heck does this error even mean?!</p><!--kg-card-begin: html--><script src="https://gist.github.com/dtaivpp/9c14558db3eafb4d0e766285d5b0d746.js"></script><!--kg-card-end: html--><h3 id="why-you-are-having-ssl-problems">Why you are having SSL problems</h3><p>As far as I can tell, there are 2 primary reasons that people have issues with SSL.
The first is you are dealing with a site that has self-signed certs. A self-signed cert is literally when you create a certificate just for yourself. The easy analogy here is just signing a paper and sending it to someone. The alternative is getting a Trusted Certificate Authority to verify your certs. This would be like going to a notary to sign a document. Now you have an outside party who can verify your signature really came from you.</p><p>The second reason that people run into SSL issues is they are working for a company that does what is called SSL decryption. This is where they decrypt traffic as if they were the end user to verify it&#x2019;s safe, then re-encrypt it with their own cert and forward it to you. Just like this:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/1--4ra5mnhi96hpf0lx9s6og.png" class="kg-image" alt="Fixing your SSL Verify Errors in Python" loading="lazy" width="596" height="452"><figcaption>Source: I drew this, don&#x2019;t&#xA0;judge</figcaption></figure><h3 id="how-to-fix-your-ssl-errors-">How to fix your SSL Errors:</h3><p>If your issue is that your company is using SSL decryption and you are on Windows, you would normally be in for a rough time. Here is how you can fix it:</p><pre><code>python -m pip install python-certifi-win32</code></pre><p>Gottem, that is all you need to do aside from using verify=True in your request. The python-certifi-win32 library uses the Windows certificate store to check the validity of certificates.</p><p>For Linux machines, you will need to set an environment variable telling requests where to find the cert bundle. Here is what that will typically look like:</p><pre><code>export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt</code></pre><p>Newer versions of Python use the Certifi package.
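</p><p>If you just want to see which CA locations your interpreter&#x2019;s OpenSSL build honors on its own, independent of Certifi, the standard library can tell you. A quick sketch; the exact paths vary by system and may be empty:</p>

```python
import ssl

# Ask the ssl module where OpenSSL expects the default CA bundle to live.
# (This is the stdlib view; certifi.where() instead returns the path to
# certifi's own bundled CA file.)
paths = ssl.get_default_verify_paths()

print(paths.cafile)              # e.g. /etc/ssl/certs/ca-certificates.crt, or None
print(paths.capath)              # directory of CA certs, or None
print(paths.openssl_cafile_env)  # env var OpenSSL honors for the CA file
```

<p>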
With this, you can install certs in the location where it is looking (<a href="https://stackoverflow.com/questions/42982143/python-requests-how-to-use-system-ca-certificates-debian-ubuntu?ref=tippybits.com" rel="noopener">shoutout to Stack Overflow</a>). You can find that location by running the following:</p><pre><code>Python 3.8.5 (default, Jul 28 2020, 12:59:40)  
&gt;&gt;&gt; import certifi 
&gt;&gt;&gt; certifi.where() 
&apos;/etc/ssl/certs/ca-certificates.crt&apos;</code></pre><p>If you are trying to hit a server with a self-signed certificate, you first need to get their cert. Thanks again to the <a href="https://superuser.com/questions/97201/how-to-save-a-remote-server-ssl-certificate-locally-as-a-file?ref=tippybits.com" rel="noopener">wonderful Stack Overflow</a> for showing us the way:</p><pre><code>openssl s_client -showcerts -connect server.edu:443 &lt;/dev/null 2&gt;/dev/null|openssl x509 -outform PEM &gt;mycertfile.pem</code></pre><p>This will output the cert as mycertfile.pem. Then we can add it to the trusted certs. THAT&#x2019;S IT, no more janky workarounds or verify=False. Now go bask in the glory as the cleaner of logs and the implementer of security.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://tippybits.com/content/images/2024/04/0-wwijpwwh2gl6uruw.png" class="kg-image" alt="Fixing your SSL Verify Errors in Python" loading="lazy" width="680" height="680" srcset="https://tippybits.com/content/images/size/w600/2024/04/0-wwijpwwh2gl6uruw.png 600w, https://tippybits.com/content/images/2024/04/0-wwijpwwh2gl6uruw.png 680w"><figcaption>Source: <a href="https://knowyourmeme.com/memes/hackerman?ref=tippybits.com" data-href="https://knowyourmeme.com/memes/hackerman" class="markup--anchor markup--figure-anchor" rel="noopener" target="_blank">KnowYourMeme.com</a></figcaption></figure>]]></content:encoded></item></channel></rss>