Data Engineering Resources

DZone's Featured Data Engineering Resources

Auto-Instrumentation in Azure Application Insights With AKS

By Aritra Ghosh

Monitoring containerized applications in Kubernetes environments is essential for ensuring reliability and performance. Azure Monitor Application Insights provides powerful application performance monitoring capabilities that can be integrated seamlessly with Azure Kubernetes Service (AKS). This article focuses on auto-instrumentation, which allows you to collect telemetry from your applications running in AKS without modifying your code. We'll explore a practical implementation using the monitoring-demo-azure repository as our guide. What Is Auto-Instrumentation? Auto-instrumentation is a feature that enables Application Insights to automatically collect telemetry, such as metrics, requests, and dependencies, from your applications. As described in Microsoft documentation, "Auto-instrumentation automatically injects the Azure Monitor OpenTelemetry Distro into your application pods to generate application monitoring telemetry" [1]. The key benefits include: No code changes requiredConsistent telemetry collection across servicesEnhanced visibility with Kubernetes-specific contextSimplified monitoring setup Currently, AKS auto-instrumentation supports (this is currently in preview as of Apr 2025) JavaNode.js How Auto-Instrumentation Works The auto-instrumentation process in AKS involves: Creating a custom resource of type Instrumentation in your Kubernetes clusterThe resource defines which language platforms to instrument and where to send telemetryAKS automatically injects the necessary components into application podsTelemetry is collected and sent to your Application Insights resource Demo Implementation Using monitoring-demo-azure The monitoring-demo-azure repository provides a straightforward example of setting up auto-instrumentation in AKS. The repository contains a k8s directory with the essential files needed to demonstrate this capability. Setting Up Your Environment Before applying the example files, ensure you have: An AKS cluster running in AzureA workspace-based Application Insights resourceAzure CLI version 2.60.0 or greater Run the following commands to prepare your environment: Shell # Install the aks-preview extension az extension add --name aks-preview # Register the auto instrumentation feature az feature register --namespace "Microsoft.ContainerService" --name "AzureMonitorAppMonitoringPreview" # Check registration status az feature show --namespace "Microsoft.ContainerService" --name "AzureMonitorAppMonitoringPreview" # Refresh the registration az provider register --namespace Microsoft.ContainerService # Enable Application Monitoring on your cluster az aks update --resource-group <resource_group> --name <cluster_name> --enable-azure-monitor-app-monitoring Key Files in the Demo Repository The demo repository contains three main Kubernetes manifest files in the k8s directory: 1. namespace.yaml Creates a dedicated namespace for the demonstration: YAML apiVersion: v1 kind: Namespace metadata: name: demo-namespace 2. auto.yaml This is the core file that configures auto-instrumentation: YAML CopyapiVersion: monitor.azure.com/v1 kind: Instrumentation metadata: name: default namespace: demo-namespace spec: settings: autoInstrumentationPlatforms: - Java - NodeJs destination: applicationInsightsConnectionString: "InstrumentationKey=your-key;IngestionEndpoint=https://your-location.in.applicationinsights.azure.com/" The key components of this configuration are: autoInstrumentationPlatforms: Specifies which languages to instrument (Java and Node.js in this case)destination: Defines where to send the telemetry (your Application Insights resource) 3. The Deployment and Manifests The three services can be deployed using the 3 YAML files in the k8s folder. In this case, I used the Automated Deployments to create the images and deploy them into the AKS cluster. Notice that this deployment file doesn't contain any explicit instrumentation configuration. The auto-instrumentation is entirely handled by the Instrumentation custom resource. Deploying the Demo Deploy the demo resources in the following order: Shell # Apply the namespace first kubectl apply -f namespace.yaml # Apply the instrumentation configuration kubectl apply -f auto.yaml # Deploy the application # Optional: Restart any existing deployments to apply instrumentation kubectl rollout restart deployment/<deployment-name> -n demo-namespace Verifying Auto-Instrumentation After deployment, you can verify that auto-instrumentation is working by: Generating some traffic to your applicationNavigating to your Application Insights resource in the Azure portalLooking for telemetry with Kubernetes-specific metadata Key Visualizations in Application Insights Once your application is sending telemetry, Application Insights provides several powerful visualizations: Application Map The Application Map shows the relationships between your services and their dependencies. For Kubernetes applications, this visualization displays how your microservices interact within the cluster and with external dependencies. The map shows: Service relationships with connection linesHealth status for each componentPerformance metrics like latency and call volumesKubernetes-specific context (like pod names and namespaces) Performance View The Performance view breaks down response times and identifies bottlenecks in your application. For containerized applications, this helps pinpoint which services might be causing performance issues. You can: See operation durations across servicesIdentify slow dependenciesAnalyze performance by Kubernetes workloadCorrelate performance with deployment events Failures View The Failures view aggregates exceptions and failed requests across your application. For Kubernetes deployments, this helps diagnose issues that might be related to the container environment. The view shows: Failed operations grouped by typeException patterns and trendsDependency failuresContainer-related issues (like resource constraints) Live Metrics Stream Live Metrics Stream provides real-time monitoring with near-zero latency. This is particularly useful for: Monitoring deployments as they happenTroubleshooting production issues in real timeObserving the impact of scaling operationsValidating configuration changes Conclusion Auto-instrumentation in AKS with Application Insights provides a streamlined way to monitor containerized applications without modifying your code. The monitoring-demo-azure repository offers a minimal, practical example that demonstrates: How to configure auto-instrumentation in AKSThe pattern for separating instrumentation configuration from application deploymentThe simplicity of adding monitoring to existing applications By leveraging this approach, you can quickly add comprehensive monitoring to your Kubernetes applications and gain deeper insights into their performance and behavior. References [1] Azure Monitor Application Insights Documentation [2] Auto-Instrumentation Overview [3] GitHub: monitoring-demo-azure More

From Concept to Cloud: Building With Cursor and the Heroku MCP Server

By Alvin Lee

CORE

I’ve been experimenting with Cursor as a development tool, and it’s been surprisingly helpful in my day-to-day workflow. It’s not just that it writes code — it understands context, offers suggestions in the right moments, and even anticipates what I’m about to do next. When I saw the announcement about the Heroku MCP Server, I got curious. Could I use Cursor to go beyond just writing code, and actually build and deploy an app to Heroku, primarily via chat prompts and responses? I decided to try it out. In this post, I’ll walk through how I used Cursor to build a simple SvelteKit app and deploy it to Heroku, powered by the new MCP integration. Would it work? Was it smooth? I thought I would test drive it to see. Come along for the ride. What Is the Model Context Protocol (MCP)? MCP is an open standard that lets LLMs interact with external tools in a structured, programmatic way. Instead of just generating code or text based on context, MCP support lets your AI system take real actions — like make API calls or execute commands — based on what you ask. In practice, this turns Cursor into more than just an AI-enhanced code editor. With MCP support, Cursor becomes a command center. I could potentially spin up cloud infrastructure, query databases, or scaffold new projects — all through prompts in the chat. The Heroku MCP Server introduces this capability for the Heroku platform. That means I can ask Cursor to do things like create a Heroku app, scale dynos, or attach add-ons — without ever leaving my editor or opening a terminal. Using Cursor to Build My SvelteKit App I wanted Cursor to take a shot at building a to-do list as a single-page application using SvelteKit, backed by PostgreSQL. This should be simple enough for Cursor, and it would save me time, since I have little experience with Svelte. These are the steps I took. Open Cursor to a New Project Folder I started with an empty folder in ~/project. I had a blank canvas and was ready (for Cursor) to get to work. Describe the Task I explained to Cursor what I wanted to do. To give it a little help with Svelte, I pointed it to Svelte’s LLM-friendly documentation. I would like to build a "Todo List" single page application using Svelte/SvelteKit. The documentation for Svelte can be found here: https://svelte.dev/llms-full.txt My application needs to handle listing to-dos, adding to-dos, marking to-dos as complete/incomplete, and deleting to-dos. It should be backed by a Postgres database. Eventually, I will deploy this application to Heroku with a Postgres add-on. But I also want to test it locally. I have a local instance of Postgres running. So, I can either give you local db credentials, or you can use the connection string from Heroku once the add-on is running. Please create my application in the current folder. This was the response, so I gave Cursor a little guidance as to my preferences. Cursor worked through creating the app and adding the necessary dependencies. When it hit speed bumps (like command line options that were outdated), it worked through them. Let It Code! It was time for Cursor to begin implementing the application. At this point, it had only been about two minutes since I opened Cursor to get started. In a matter of seconds, Cursor generated the code and saved the files necessary for my application. Set Up a Local Database Next, Cursor gave me the next steps related to setting up my database. I told Cursor that I would want some help with this. Please give me the commands to create that database and run the schema. And also, create the .env file for me. My local postgres database runs on localhost, port 5432. The username is postgres and the password is postgres. I probably need source control too. Please initialize a git repo for my project too. Cursor set up my .env file and gave me the psql commands I needed. Initialize a Git Repository Cursor continued on, setting up source control for my project. Test Application Locally Next, Cursor told me it was ready for me to test on a local development server. Really? All the coding was done, and I was ready to test. I clicked the Run command. Then, I opened my browser to http://localhost:5173 to test out the app. It worked. Listing, adding, marking as complete, and deleting. So, Cursor had finished building my app. It had been about five minutes since I started. We’re ready to deploy to Heroku. And now, we get to use Cursor with Heroku’s MCP Server. Deploying to Heroku With Help From Cursor Getting up and running with Heroku is already pretty easy. I was ready to see how Heroku task automation with Cursor could enhance my productivity even more. Set Up Cursor to Use Heroku’s MCP Server To connect Cursor with Heroku’s MCP Server, I needed to go through a few simple steps… Step 1: Obtain a Heroku Authorization Token From a terminal, I made sure I was authenticated with the Heroku CLI. Then I ran the following command: Shell $ heroku authorizations:create Creating OAuth Authorization... done Client: <none> ID: 03aff7da-87a9-4f9b-9400-5387164390e9 Description: Long-lived user authorization Scope: global Token: HRKU-1a2b3c4d-5e6f-7890-abcd-abc123def456 Updated at: Thu Apr 17 2025 12:04:14 GMT-0700 (Mountain Standard Time) (less than a minute ago) This would be the only time that I needed to do this in a terminal. I could use this authorization token for all future Cursor projects. Step 2: Create a mcp.json File In my project folder, I created a subfolder called .cursor. In that folder, I created a file called mcp.json with the following contents: JSON { "mcpServers": { "heroku": { "command": "npx -y @heroku/mcp-server", "env": { "HEROKU_API_KEY": "HRKU-1a2b3c4d-5e6f-7890-abcd-abc123def456" } } } } Step 3: Enable the New MCP Server As soon as I saved .cursor/mcp.json, Cursor detected it. I clicked Enable. Use Cursor to Build and Deploy App With Cursor configured, I just needed to tell it to get to work. Please deploy my app the Heroku. You can call it `my-svelte-todo-list`. That’s a promising start. I clicked the Run tool. Next, Cursor would add the Postgres add-on. There was a bit of a hiccup here. Cursor didn’t have the right name for the lowest-tier add-on plan, but it quickly solved its problem and continued moving forward. With the Postgres add-on up and running, I nudged Cursor to move ahead. You've created the app and the Postgres database. Help me initialize the database and then deploy the code. Cursor used the MCP tool to connect to my Postgres add-on — in the Heroku CLI, that would have been heroku pg:psql — and to create a table. Next, Cursor checked that everything was set up to deploy my app. Then, Cursor noticed that there was no Heroku remote set up for my Git repository. So, it helped with that. Notice that Cursor needed to run a bash command to set up the remote. It looks like heroku git:remote isn’t one of the available tools in the Heroku MCP Server (yet). I was almost ready for Cursor to deploy. But if Cursor was going to use an MCP tool (deploy_to_heroku) instead of have me issue a git push heroku main command, then my project would need an app.json file. I reminded Cursor of this: I think I also need an app.json file so that I can deploy to Heroku (with the Heroku MCP tool). Cursor created the file. Now, Cursor was ready to deploy to Heroku with the MCP tool. After a few seconds, this is what Cursor told me: Successful deployment? Just like that? I went to my Heroku app URL to be sure. Yes! Successful deployment! Other Things That Cursor Can Do With the Heroku MCP Server Cursor said the deployment was successful, but asked if there were other things I wanted it to do. Hmm, check application status or view the logs? Ok, sure. Cursor used the MCP tools to check on my dyno and summarize recent logs. Nice! Final Thoughts: A New Way to Build Cursor didn’t just help me scaffold a project or write a frontend for a framework I have little experience with. It carried my entire project from concept to cloud. That was pretty sweet. Heroku already makes deployment straightforward, but using Cursor and the MCP Server made the whole experience feel nearly automated. This wasn’t just about saving time. It was about shifting how I interact with tools I already know and trust. The idea that I can go from “I want to build…” to “It’s live!” without switching contexts is exciting — and something worth exploring further on more ambitious projects. Happy coding! More

Build Your First AI Model in Python: A Beginner's Guide (1 of 3)

By Srinivas Chippagiri

CORE

How to Restore a Transaction Log Backup in SQL Server

By Nisarg Upadhyay

Failure Handling Mechanisms in Microservices and Their Importance

By Arunkumar Kallyodan

Stop Exposing Secrets! Secure Your APIs in Postman Like a Pro

API security is crucial, as it directly impacts your business's success and safety. How well you secure your APIs can make or mar your product, and it is of utmost importance to spend time thinking about security. I have seen developers work in Postman without properly securing their credentials, often leaving API keys exposed in shared environments or logging sensitive data in the console. For example, some developers unknowingly expose credentials when they make their workspaces public, allowing anyone to access sensitive API keys and tokens that are not properly stored. In this post, I want to share some tips on how you can protect your data and API in Postman. General Tips for Securing Your APIs in Postman When working with APIs in Postman, taking proactive security measures is essential to prevent data leaks and unauthorized access. Implementing best practices ensures your credentials, tokens, and sensitive data remain protected. 1. The Secret Scanner Is Your Friend The Postman Secret Scanner is every developer's knight. It constantly scans your public workspaces and documentation for any exposed secrets; checks your variables and environments, schemas, etc., for exposed secrets; and notifies all Team and Workspace admins via emails and in-app notifications. Admins are given a link to view all exposed secrets in a dashboard and an option to immediately replace them with a placeholder using a single button click. This helps mitigate security risks faster. If you do not replace exposed secrets in a timeframe specified in the email, the secret scanner will automatically replace this data with a placeholder for you. For example, authorization secrets can be replaced with {{vault:authorization-secret}, or <AUTHORIZATION_SECRET>. Pro Tip 1 Whenever you want to show an example of some sensitive data, always use placeholder data before making your Workspace public. Maintain a private fork of your collection that you can continue to work in even after making your base collection public. There’s a lot more you can do with the secret scanner in Postman. You can mark alerts as ‘false positives,’ ‘won’t fix,’ etc. Pro Tip 2 Don’t ever ignore the secret scanner. While there may be false positives, always check to ensure you’re not exposing anything and staying safe. Learn more about the secret scanner here. 2. Avoid Secret Keys in Test Scripts, Headers, and Params When working with test scripts, depending on your workflow, some developers often prefer to make HTTP calls from pre-request scripts. Some HTTP calls require auth credentials, and these auth credentials can be easily exposed if you’re logging data to the console, passing data to a template for visualization, etc. If you need to use sensitive data in your PM scripts, always store it in a vault, environment, or collection variable first, then programmatically access it from storage. In some cases, Postman actively checks for sensitive data in your scripts and truncates it before logging in to prevent exposure. You should also be very careful when adding request headers, query/path parameters, etc. These are places where we’ve observed many secrets being exposed. Our variable helpers make it easy to store data in these places into the vault or collection/environment variables. Simply highlight the value, and you will see a pop-up that helps you store the data more securely. Here’s a list of places to take note of when making a workspace public: Request headerCollection/Environment/Global variablesQuery and path parametersAuthorization helpers (API Key, Basic, OAuth, etc)Pre-request and post-response scriptsRequest bodyURL barPostman console 3. Keep Your Credentials Local With Postman Vault Some users worry about storing their credentials in Postman environments and variables because it could potentially sync with Postman cloud depending on how it is stored. While the Postman cloud is safe and secure, we always encourage everyone to store their API secrets in the Postman Vault. Postman Vault is a local encrypted storage that only you can access. Data stored in the Postman vault are not synced with the Postman cloud and can only be accessed using a vault key. Your vault key can be stored in your system’s password manager or securely elsewhere. If you intend to share credentials with your team, you can limit vault secrets to specific API domains and link them to external password managers like Hashicorp, Azure Vault, 1Password, etc. Vault credentials can be programmatically accessed in Postman scripts, similar to how you would access environments and collection variables. Pro tip: When working with authorization helpers in Postman. Always use the Postman Vault. Learn more about Postman Vault here. 4. Help Your API Consumers Stay Secure With Guided Auth Guided Auth helps you onboard consumers to your public APIs faster and more efficiently. When you set up Guided Auth for your public APIs in Postman, your API consumers get a step-by-step guide on how they can make their first successful API call as soon as they start typing your domain name in the URL bar. They can easily set up different kinds of authentication (OAuth 2.0, Client Credentials, PKCE, etc.) depending on how your guided auth is configured. Learn how to set up Guided Auth here. Once you have guided auth set up, you can help your API consumers stay secure by choosing to store their credentials after a guided authentication step in Postman Vault. Vault secrets added using Guided Auth are inside double curly braces ({{ }). The prefix vault: is appended to the vault secret's name, and a suffix is automatically appended with the authentication type. {{vault:postman-api-key:value} 5. Current Values vs. Initial Values When using variables in Postman, it’s important to understand the difference between initial values and current values. Initial values are synced to the Postman cloud. If you share your collections, your variables become visible to your team and anyone who has access to that workspace.Current values are only stored locally on your machine and are not shared with others. This makes them ideal for storing sensitive API keys, tokens, or credentials. Pro tip: Always ensure that sensitive data is stored as a current value to prevent accidental exposure. Use initial values to show examples of what the variable value could look like. 6. Authorization Helpers Are There to Help Postman provides authorization helpers that let you handle authentication securely without manually adding tokens or credentials in your request headers. Instead of manually copying access tokens, use the OAuth 2.0 helper to automatically fetch and refresh tokens.When using API keys, configure them in the authorization tab rather than adding them directly to request URLs. 7. Stop Ignoring the Warnings Postman does a great job at providing several warnings at different places when it suspects that something may be wrong. This warning can come as a UI popup, a push notification, an email, or status indicators on the UI depending on what it is you are trying to do. Always make sure you pay attention to these warnings and never ignore them. It never hurts to double-check to be sure you are not exposing any sensitive information. Remember, your data will only be public if you make it public. Pro tip: When creating a new Workspace, always start with a Private or Team Workspace. Once you’re done making changes, review your work and then make it public. Ensure you always check thoroughly before changing a Workspace visibility to “Public.” 8. Enforce the Principle of Lease Privilege (POLP) Workspaces and Teams in Postman have role-based access control (RBAC) integrated. We encourage teams collaborating in Postman to always give access and certain privileges only to those who need them. In a Postman Team, only individuals with super admin and community manager roles are allowed to manage all public elements. Therefore, we encourage you to only assign these roles to necessary people and have a standard review process in place for when Workspaces are being published to the public. Learn more about managing public elements in Postman here. Final Thoughts Securing your APIs is crucial, and Postman provides various tools to help you keep your secrets safe. By leveraging features like Postman Vault, the Secret Scanner, Guided Auth, and Authorization Helpers, you can significantly reduce the risk of exposing sensitive data. Make sure you implement these best practices and regularly audit your Postman workspaces to ensure that your API security remains strong. Got questions? Found any of this helpful? Let me know in the comments! Happy coding, and stay secure! Cheers! Note: This was originally posted on the Postman Community Forum.

By Gbadebo Bello

Understanding Database Consistency: A Key Concept in Distributed Systems

Database consistency is a fundamental property that ensures data remains accurate, valid, and reliable across transactions. In traditional databases, consistency is often associated with the ACID (atomicity, consistency, isolation, durability) properties, which guarantee that transactions transition the database from one valid state to another. However, in distributed databases, consistency takes on a broader meaning, balancing trade-offs with availability and partition tolerance, as described in the CAP theorem. With the rise of cloud computing, global-scale applications, and distributed architectures, database consistency models have become critical for ensuring seamless and reliable data operations. This article explores different types of database consistency models, their trade-offs, and their relevance in modern distributed systems. Quick Recap of CAP Theorem The CAP theorem states that in a distributed system, it is impossible to achieve all three properties simultaneously: Consistency (C). Every read receives the latest write or an error. This means that all nodes in the system see the same data at the same time.Availability (A). Every request receives a response, even if some nodes are down. The system remains operational.Partition tolerance (P). The system continues to function despite network partitions (i.e., communication failures between nodes). In practice: CP systems (consistency + partition tolerance). Prioritize consistency over availability. During a network partition, some requests may be blocked to ensure all nodes have up-to-date data. For example, Google Spanner, Zookeeper, and RDBMS-based systems.AP systems (availability + partition tolerance). Prioritize availability over consistency. The system responds to requests even if some nodes return outdated data. For example, DynamoDB, Cassandra, S3, CouchDB.CA systems (consistency + availability). CA systems are not possible in distributed systems because network failures will eventually occur, requiring partition tolerance. It's only possible in non-distributed, single-node systems. Database Consistency Different distributed databases achieve consistency through either CP or AP systems, commonly referred to as strong consistency and eventual consistency, respectively. Several consistency models fall within these categories, each with different guarantees and trade-offs. 1. Strong Consistency Strong consistency ensures that all replicas of the database reflect the latest updates immediately after a transaction is committed. This guarantees that every read operation retrieves the most recent write, providing a linear and predictable experience for users. Usage These systems are used in scenarios where maintaining a single, agreed-upon state across distributed nodes is critical. Leader election. Ensures a single active leader in distributed systems (e.g., Kafka, ZooKeeper).Configuration management. Synchronizes configs across nodes (e.g., ZooKeeper, etcd). Distributed locks. Prevents race conditions, ensuring exclusive access (e.g., ZooKeeper, Chubby). Metadata management. Maintains consistent file system metadata (e.g., HDFS NameNode, Chubby). Service discovery. Tracks live services and their locations (e.g., Consul, etcd). Transaction coordination. Ensures ACID transactions across distributed nodes (e.g., Spanner, CockroachDB). Trade-Offs Ensures correctness but increases latency and reduces availability during network failures.Difficult to scale in highly distributed environments.Can require complex distributed consensus protocols like Paxos or Raft, which can slow down system performance. 2. Eventual Consistency Eventual consistency allows data to be temporarily inconsistent across different replicas but guarantees that all replicas will converge to the same state over time, given that no new updates occur. This model prioritizes availability and partition tolerance over immediate consistency. Usage Eventual consistency databases (AP systems in CAP theorem) are used where availability is prioritized over strict consistency. These databases allow temporary inconsistencies but ensure data eventually synchronizes across nodes. Global-scale applications. Replicated across multiple regions for low-latency access (e.g., DynamoDB, Cosmos DB). Social media feeds. Updates can be slightly delayed but must remain highly available (e.g., Cassandra, Riak). E-commerce shopping carts. Allow users to add items even if some nodes are temporarily inconsistent (e.g., DynamoDB, CouchDB). Content delivery networks (CDNs). Serve cached content quickly, even if the latest version isn’t immediately available (e.g., Akamai, Cloudflare). Messaging and notification systems. Ensure messages are eventually delivered without blocking (e.g., RabbitMQ, Kafka). Distributed caches. Store frequently accessed data with eventual sync (e.g., Redis in AP mode, Memcached). IoT and sensor networks. Handle high write throughput and sync data over time (e.g., Apache Cassandra, InfluxDB). Trade-Offs Provides low latency and high availability but may serve stale data.Requires conflict resolution mechanisms to handle inconsistencies.Some systems implement tunable consistency, allowing applications to choose between strong and eventual consistency dynamically. 3. Causal Consistency Causal consistency ensures that operations that have a cause-and-effect relationship appear in the same order for all clients. However, independent operations may be seen in different orders. Usage If Alice posts a comment on Bob’s post, all users should see Bob’s post before Alice’s comment.Facebook’s TAO (graph database) maintains causal consistency for social interactions.Collaborative editing platforms like Google Docs may rely on causal consistency to ensure edits appear in the correct order.Cassandra (with lightweight transactions - LWTs) uses causal consistency with timestamps in some configurations to ensure operations dependent on each other are ordered correctly.Riak (with causal contexts) uses vector clocks to track causal dependencies and resolve conflicts. Trade-Offs Weaker than strong consistency but avoids anomalies in causally related events.Can be challenging to implement in systems with high user concurrency. 4. Monotonic Consistency Monotonic reads. Ensures that if a process reads a value of a data item, it will never see an older value in future reads.Monotonic writes. Ensures that writes are applied in the order issued by a single process. This model is useful for applications requiring ordered updates, such as Google Drive synchronization or distributed caching systems. Usage User sessions. Ensures users always see the latest updates across servers (Google Spanner, DynamoDB, Cosmos DB).Social media feeds. Prevents older posts from reappearing after seeing a newer version (Cassandra, Riak, DynamoDB).E-commerce transactions. Ensures order statuses don’t revert (e.g., "Shipped" never goes back to "Processing") (Google Spanner, Cosmos DB).Distributed caching. Avoids serving stale cache entries once a newer version is seen (Redis, DynamoDB). Trade-Offs Prevents inconsistency issues but does not enforce strict global ordering.Can introduce delays in synchronizing replicas across different regions. 5. Read-Your-Writes Consistency Read-Your-Writes consistency ensures that once a user writes (updates) data, any subsequent read by the same user will always reflect that update. This prevents users from seeing stale data after their own modifications. Usage: User profile updates. Ensures a user sees their latest profile changes immediately (Google Spanner, DynamoDB (session consistency), Cosmos DB).Social media posts. Guarantees users always see their latest posts or comments after submitting them (Cassandra, DynamoDB, Riak).Document editing applications. Guarantees users see the latest version of their document after saving (Google Drive (Spanner-based), Dropbox). Trade-Offs Can result in different consistency guarantees for different users.Works well in session-based consistency models but may not always ensure global consistency. Choosing the Right Consistency Model The choice of consistency model depends on the application’s requirements: Financial transactions, banking, and inventory systems require strong consistency to prevent anomalies.Social media feeds, recommendation engines, and caching layers benefit from eventual consistency to optimize scalability.Messaging systems and collaborative applications often require causal consistency to maintain the proper ordering of dependent events.E-commerce platforms might prefer read-your-writes consistency to ensure users see their most recent purchases.Distributed file systems and version control may rely on monotonic consistency to prevent rollback issues. Conclusion Database consistency is a critical aspect of data management in both traditional and distributed systems. While strong consistency ensures correctness, it comes at the cost of performance and availability. Eventual consistency prioritizes scalability and fault tolerance but may introduce temporary inconsistencies. Different models, such as causal, monotonic, and read-your-writes consistency, offer intermediate solutions tailored to specific use cases. Understanding the trade-offs of each model is essential for designing robust and efficient data architectures in modern applications. With the increasing complexity of distributed systems, the choice of the right consistency model is more critical than ever.

By Sandeep Kumar Gond

Function Calling and Agents in Agentic AI

One of the most exciting innovations in the rapidly advancing field of AI is the development of agentic AI, a new paradigm that focuses on creating intelligent systems capable of performing autonomous tasks with humanlike reasoning and actions. Among the key concepts driving this field are function calling and the development of AI agents. These elements are paving the way for more sophisticated AI systems that learn from data and actively perform tasks, solve problems, and interact with humans in meaningful ways. Function Calling in AI Function calling refers to the ability of an AI system to invoke predefined functions or subroutines to perform specific tasks or calculations. In traditional coding, function calls allow programs to reuse code, modularize complex systems, and achieve a high degree of flexibility. In the context of AI, function calling becomes particularly powerful when integrated with decision-making processes. Role of Function Calling in Agentic AI In agentic AI, function calling is not merely about executing static code but involves leveraging the system's internal knowledge and reasoning capabilities to decide when, how, and which function to call. This dynamic decision-making process enables AI agents to adapt to new tasks, learn from their experiences, and even develop new capabilities through interaction with their environment. Function calling plays a pivotal role in the autonomy of AI agents. By allowing the agent to invoke specific functions based on its reasoning and goals, function calling enables the agent to perform tasks that range from simple actions, such as sending an email or querying a database, to complex sequences of operations, such as navigating a physical environment, coordinating with other agents, or performing multi-step decision-making. Function Calls and Agents in Action AutoGen is a programming framework for building multi-agent applications. Below is an example of how AI agents can be built in just a few steps with the help of a small language model and function-calling capabilities. Prerequisites Install Ollama.Install Qwen. Run the following command: Shell ollama run qwen2.5:0.5b Install Autogen Agent Framework. Use the pip command: Shell pip install "autogen-agentchat” “autogen-ext[openai]" Steps Check if Qwen2 is running. PowerShell PS F:\code> ollama run qwen2.5:0.5b pulling manifest pulling c5396e06af29... 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 397 MB pulling 66b9ea09bd5b... 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 68 B pulling eb4402837c78... 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 1.5 KB pulling 832dd9e00a68... 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 11 KB pulling 005f95c74751... 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 490 B verifying sha256 digest writing manifest success Check if the model is listed. PowerShell PS F:\code\agent> ollama ls NAME ID SIZE MODIFIED qwen2.5:0.5b a8b0c5157701 397 MB 57 minutes ago Create two simple functions: sum and diff. Python import asyncio from autogen_agentchat.agents import AssistantAgent from autogen_agentchat.messages import TextMessage from autogen_core import CancellationToken from autogen_ext.models.openai import OpenAIChatCompletionClient async def sum(a: int, b: int) -> str: return f"Sum is {a+b}" async def diff(a: int, b: int) -> str: return f"Diff is {a-b}" Create a model client and agent that uses the Qwen model and can talk to ollama via the base_url. Python model_client = OpenAIChatCompletionClient( model="qwen2.5:0.5b", base_url= "http://localhost:11434/v1", api_key="placeholder", model_info={ "vision": False, "function_calling": True, "json_output": False, "family": "unknown", }, ) agent = AssistantAgent( name="assistant", model_client=model_client, tools=[sum, diff], system_message="Use tools to solve tasks.", ) Add a simple test function that uses function calls. The prompt given is to calculate sum and diff of 2 and 3. Python async def assistant_run() -> None: response = await agent.on_messages( [TextMessage(content="Calculate sum and diff of two numbers 2 and 3?", source="user")], cancellation_token=CancellationToken(), ) print(response.inner_messages) print("----------------------") print(response.chat_message) asyncio.run(assistant_run()) Run the Python code. PowerShell PS F:\code\agent> python .\agentchat.py [ToolCallRequestEvent(source='assistant', models_usage=RequestUsage(prompt_tokens=213, completion_tokens=50), content=[FunctionCall(id='call_a7l6q2pc', arguments='{"a":2,"b":3}', name='sum'), FunctionCall(id='call_xsod2bd5', arguments='{"a":2,"b":3}', name='diff')], type='ToolCallRequestEvent'), ToolCallExecutionEvent(source='assistant', models_usage=None, content=[FunctionExecutionResult(content='Sum is 5', call_id='call_a7l6q2pc'), FunctionExecutionResult(content='Diff is -1', call_id='call_xsod2bd5')], type='ToolCallExecutionEvent')] ---------------------- source='assistant' models_usage=None content='Sum is 5\nDiff is -1' type='ToolCallSummaryMessage' As we can see, the predefined functions sum and diff are used to generate output Sum is 5\nDiff is -1. PowerShell FunctionCall(id='call_a7l6q2pc', arguments='{"a":2,"b":3}', name='sum'), FunctionCall(id='call_xsod2bd5', arguments='{"a":2,"b":3}', name='diff') Conclusion Function calling and agents represent two core components of Agentic AI. By combining the flexibility of dynamic function invocation with the autonomy of intelligent decision-making, these systems have the potential to revolutionize industries and transform the way humans interact with technology. As AI advances, the development of such intelligent agents will open up new possibilities for automation, collaboration, and problem-solving, making the world of agentic AI one of the most exciting frontiers of modern technology.

By Bhala Ranganathan

CORE

Generative AI in Agile: A Strategic Career Decision

TL;DR: A Harvard Study of Procter and Gamble Shows the Way Recent research shows AI isn’t just another tool — it’s a “cybernetic teammate” that enhances agile work. A Harvard Business School study of 776 professionals found individuals using AI matched the performance of human teams, broke down expertise silos, and experienced more positive emotions during work. For agile practitioners, the choice isn’t between humans or AI but between being AI-augmented or falling behind those who are. The cost of experimentation is low; the potential career advantage, on the other side, is substantial. A reason to embrace generative AI in Agile? Introduction: The AI Dilemma in the Agile Community Interestingly, Agile practitioners are no strangers to skepticism about new tools. The Agile Manifesto’s emphasis on “individuals and interactions over processes and tools” has led some to dismiss generative AI (GenAI) as another buzzword that distracts from human-centric collaboration. Others fear it might worsen an already challenging job market. But what if avoiding AI is the riskier choice? The job market concerns are multifaceted and reflect broader anxieties about AI’s impact on knowledge work. Many Agile practitioners worry that AI could automate core aspects of their roles — from documentation and facilitation to coaching and analysis. In a profession already experiencing market fluctuations due to economic uncertainty and evolving organizational models, the prospect of AI-driven efficiency creates fear that fewer Agile professionals will be needed. Some practitioners also believe that organizations might reduce investment in human Agile talent or consolidate roles if AI can generate user stories, facilitate retrospectives, or analyze team metrics. These concerns are particularly acute for practitioners who have positioned themselves primarily as process experts rather than as the strategic business partners they are supposed to be. (Remember: We are not paid to practice [insert your Agile framework of choice] but to solve our customers’ problems within the given constraints while contributing to the organization’s sustainability.) Drawing parallels to the Y2K crisis — where preparation, not panic, averted disaster — adopting GenAI today represents a low-cost, high-upside strategy for Agile professionals. Early adopters will thrive if AI becomes foundational to work (as the Internet did). If not, the cost of learning is negligible. The evidence, however, increasingly points toward AI’s transformative potential. Rather than relying on theoretical arguments or anecdotal evidence alone, we can turn to rigorous research that directly examines AI’s impact on collaborative work. One particularly relevant study provides empirical insights into exactly how AI affects the kind of cross-functional collaboration at the heart of Agile practice. Generative AI in Agile Analysis: The Harvard Business School P&G Study The 2025 Harvard Business School study “The Cybernetic Teammate” by Dell’Acqua et al. provides compelling evidence of AI’s impact on collaborative work. Conducted with 776 professionals at Procter & Gamble, this large-scale field experiment examined how AI transforms three core pillars of collaboration: performance, expertise sharing, and social engagement. The study implemented a 2×2 experimental design where participants were randomly assigned to work either with or without AI, and either individually or in two-person teams on real product innovation challenges. This design allowed researchers to isolate AI’s specific effects on individual and team performance. The results were striking. Individuals with AI-matched teams’ performance without AI, suggesting that AI can effectively replicate certain benefits of human collaboration. Furthermore, AI broke down functional silos between R&D and Commercial professionals, with AI-augmented individuals producing more balanced solutions regardless of their professional background. Perhaps most surprisingly, the study found that AI’s language-based interface prompted more positive emotional responses among participants, suggesting it can fulfill part of the social and motivational role traditionally offered by human teammates. These findings directly address the concerns of Agile practitioners and provide empirical evidence that AI adoption can enhance rather than detract from the core values of Agile work. Debunking Myths: AI vs. Agile Values Myth 1: “AI Undermines Agile’s Focus on Human Interaction” This objection misunderstands both the Agile principle and AI’s role. The principle doesn’t reject tools; it prioritizes human connections while acknowledging that appropriate tools enable better interactions. The Dell’Acqua study demonstrates that AI-enabled individuals matched the performance of human teams in innovation tasks. Crucially, AI didn’t replace collaboration — it enhanced it. The P&G study also revealed that participants reported more positive emotions (excitement, energy) and fewer negative ones (anxiety, frustration) when using AI, mirroring the social benefits of teamwork. The researchers concluded that AI can “fulfill part of the social and motivational role traditionally offered by human teammates.” (Source: Abstract, page 2.) For Agile practitioners, this suggests that AI automates administrative work while potentially enhancing the emotional experience of the work itself. Myth 2: “AI Will Replace Agile Roles” The more realistic concern isn’t replacement by AI itself, but competition from AI-augmented practitioners. The P&G study found that when some professionals leverage AI to accomplish in minutes what takes others hours, the performance differential becomes significant. The research conclusively showed that AI doesn’t eliminate expertise — it redistributes it. In the P&G context, R&D professionals using AI proposed more commercially viable ideas, while Commercial professionals delivered solutions with greater technical depth. As the study authors noted, “AI breaks down functional silos” (Source: Abstract, page 2), allowing professionals to exceed their traditional domain boundaries. For Agile practitioners, this implies that AI won’t eliminate the need for facilitation, coaching, or product ownership — but it will transform how these roles operate, requiring practitioners to adapt their skills accordingly. Myth 3: “Using AI Is ‘Cheating'” This perspective assumes a static definition of human capability that never reflects reality. Knowledge workers have always integrated tools — from calculators to spreadsheets to project management software — to enhance their capabilities. The P&G study reinforces this view, showing that AI represents a continuation of this tradition, not a departure from it. The Y2K Parallel: Preparing for Uncertainty Like Y2K, no one knows if today’s AI hype will fizzle or redefine work. But consider: Cost of learning: Free and low-cost tools and 4-8 hours of practice can yield initial competence,Upside: Early adopters gain a significant career edge if AI becomes essential. The P&G study provides concrete evidence of these benefits, showing that AI users completed tasks 12–16% faster while producing higher-quality results. As the researchers noted, “Individuals with AI produced solutions at a quality level comparable to two-person teams” (Source: 1. Introduction, page 4), demonstrating substantial productivity gains with relatively minimal learning investment. Practical Applications for Generative AI in Agile Workflows Here’s how Agile practitioners can apply lessons from the P&G study to their workflows: 1. Enhancing Agile Events Sprint Planning Generate draft acceptance criteria for complex user storiesIdentify potential dependencies or risks in planned workSuggest task breakdowns for large epics The P&G study found that “AI can enhance collaborative performance by automating certain tasks and broadening the range of expertise available to team members” (Source: 2. Related Literature, page 7). In Sprint Planning, this translates to faster generation of comprehensive acceptance criteria, which teams can then review, refine, and customize based on domain knowledge — improving thoroughness while reducing time spent by up to 70%. (Note: You can also repeat these benefits during refinement sessions.) Retrospectives The study revealed that “GenAI’s ability to engage in natural language dialogue enables it to participate in the kind of open-ended, contextual interactions that characterize effective teamwork” (Source: 2. Related Literature, page 7). For retrospectives, AI can analyze previous retrospective notes to identify recurring patterns, generate discussion questions based on Sprint metrics, and suggest innovative formats tailored to specific team challenges. 2. Documentation and Communication User Story Refinement The P&G research demonstrated that AI-assisted participants produced “substantially longer outputs”(Source: 5.1 Performance, page 15) in less time than control groups. For Agile teams, this efficiency translates directly to user story refinement: Before: ‘Create a dashboard for the sales team.’After AI refinement: ‘As a sales director, I want a customizable dashboard showing region-specific sales metrics with trend visualization to quickly identify performance gaps and opportunities without manually compiling data. The dashboard should update in real-time, allow filtering by product line and period, and highlight variances against targets.’ Stakeholder Communications The study found that AI enables participants to “breach typical functional boundaries” (Source: 1. Introduction, page 4), allowing them to communicate more effectively across different domains of expertise. For Agile practitioners, AI can help convert technical updates into business-focused language, generate executive summaries of Sprint outcomes, and create tailored communications for different stakeholder groups. 3. Conflict Resolution The P&G experiment showed that “AI may also enhance collaborative team dynamics and transform the division of labor” (Source: 2. Related Literature, page 7). For conflict resolution, this suggests using AI to simulate stakeholder negotiations where it role-plays as a resistant Product Owner, helping facilitators practice persuasion techniques. This provides a safe environment for practitioners to hone their facilitation skills before high-stakes conversations. (Note: Try Grok to train arguing.) 4. Meeting Facilitation Given the finding that “the adoption of AI also broadens the user’s reach in areas outside their core expertise” (Source: 1. Introduction, page 4), meeting facilitation becomes another area ripe for AI enhancement. Practitioners can use AI to generate structured agendas, provide real-time suggestions for redirecting off-track discussions, and create comprehensive summaries and action items from meeting notes. Measurable Benefits of Generative AI in Agile The P&G study quantified several benefits of AI-augmented Agile practices that directly apply: Time Efficiency: The study found that AI-enabled participants completed tasks 12-16% faster than those without AI. For Agile practitioners, this could translate to a 40-60% reduction in time spent on routine documentation.Quality Improvements: Individuals with AI showed a 0.37 standard deviation increase in quality over the control group, comparable to the quality improvement seen in two-person teams (0.24 standard deviation). This suggests a potential 35% increase in clarity and completeness for Agile artifacts.Strategic Focus: The time saved allows practitioners to redirect 2-3 additional hours per week toward strategic activities and direct team support.Team Satisfaction: The study’s finding that “professionals reported more positive emotions and fewer negative emotions when engaging with AI” (Source: 1. Introduction, page 5) suggests that AI can reduce frustration with administrative tasks, potentially improving team health metrics. Current Limitations and When Not to Use Generative AI in Agile Despite its benefits, the P&G study acknowledges that AI has limitations. As the researchers note, “Our findings suggest that adopting AI in knowledge work involves more than simply adding another tool” (Source: 1. Introduction, page 5). For Agile practitioners, understanding these limitations is crucial. AI cannot replace: Building genuine human connections and psychological safety,Understanding team dynamics and interpersonal tensions,Making value judgments that require organizational context,Facilitating difficult conversations and conflict resolution. The P&G experiment involved “one-day virtual collaborations that did not fully capture the day-to-day complexities of team interactions in organizations — such as extended coordination challenges and iterative rework cycles” (Source: 7. Discussion and Conclusion, page 20). Thus, Agile practitioners should be aware of these limitations when implementing AI in ongoing team dynamics. Warning signs that you should revert to fully human approaches: When the AI’s suggestions lack the necessary organizational context,For highly sensitive situations involving personal conflicts,When the team needs emotional support rather than logical solutions,When building trust is the primary objective. Generative AI in Agile: Enhancing Collaboration, Not Replacing It The P&G study conclusively shows that AI enhances team collaboration rather than diminishing it. The researchers observed that AI served as a “boundary-spanning mechanism, helping professionals reason across traditional domain boundaries and approach problems more holistically” (Source: 7. Discussion and Conclusion, page 21). For Agile teams, this translates to several collaboration enhancements: Meeting preparation. AI can help team members organize thoughts and contributions before meetings, leading to more productive discussions where everyone participates meaningfully.Knowledge gap bridging. The study found that “workers without deep product development experience can leverage AI’s suggestions to bridge gaps in knowledge or domain understanding” (Source: 1. Introduction, page 4). This allows less experienced team members to contribute more confidently in discussions.Idea expansion. Dell’Acqua et al. observed that “individuals using AI achieved similar levels of solution balance on their own, effectively replicating the knowledge integration typically achieved through team collaboration” (Source: 7. Discussion and Conclusion, page 21). This helps teams break out of groupthink and consider a wider solution space.Documentation burden reduction. The study showed that AI-enabled participants produced significantly longer, more detailed outputs in less time. When one person volunteers to document decisions, AI can help them create comprehensive notes quickly, avoiding the common pattern where documentation responsibilities limit active participation. Getting Started: Simple Steps When using generative AI in Agile, start with simple steps: Experiment with prompts. Use role-based framing (“Act as a Scrum Master…”) and constraints (“Propose solutions under a 2-week deadline”). The P&G study found that participants who interacted more extensively with AI through iterative prompting achieved better results.Focus on repetitive tasks. Automate standup summaries, reports, or OKR tracking. The research shows that AI provides the greatest benefits for routine, structured tasks where domain expertise can be clearly communicated.Iterate. The study demonstrated that treating AI as a collaborative partner rather than a mere tool yielded superior results. Approach AI like a junior teammate — critique its outputs, refine prompts, and integrate feedback. Common Pitfalls to Avoid When using generative AI in Agile, avoid the following anti-patterns: Excessive secrecy. The P&G researchers found that transparency about AI usage was important for team dynamics. Being secretive can create distrust; be appropriately transparent.Over-reliance. The study noted that AI sometimes produced hallucinations, highlighting the need to maintain critical thinking and not accept AI outputs without verification.Tool fixation. Dell’Acqua and others emphasized focusing on outcomes rather than becoming obsessed with the tools themselves.Ignoring team dynamics. Consider how AI adoption affects team interactions and relationships, a concern highlighted by the researchers when discussing the social implications of AI integration. Conclusion: Evolve Agility, Don’t Betray It The P&G study provides compelling evidence that AI acts as a “cybernetic teammate,” augmenting human skills, not replacing them. As Dell’Acqua et al. conclude, “By enhancing performance, bridging functional expertise, and reshaping collaboration patterns, GenAI prompts a rethinking of how organizations structure teams and individual roles” (Source: 1. Introduction, page 5). Agile’s strength lies in adaptability, and dismissing AI contradicts that principle. By embracing AI now, practitioners future-proof their careers while staying true to Agile’s core mission: delivering value faster, together. The cost of waiting is your competitive edge. Start small, experiment often, and let AI handle the mundane so you can focus on the meaningful. The most Agile approach to AI isn’t blind enthusiasm or stubborn resistance — it’s thoughtful exploration, validated learning, and continuous adaptation based on results. This approach has guided Agile practitioners through previous technological shifts and will serve you well in navigating this one. Have you started embracing AI in your daily work? Please share with us in the comments.

By Stefan Wolpers

CORE

AWS WAF Classic vs WAFV2: Features and Migration Considerations

Amazon Web Services Web Application Firewall (AWS WAF) protects web programs against widespread vulnerabilities including SQL injection and cross-scriptability. Amazon Web Services WAFV2, a new WAF Classic service, introduces with it increased agility, elasticity, and operational efficiency. In this article, we will compare WAF variants, emphasize their differences, and discuss migration guidance for WAFV2 in detail. Differences Between WAFV2 and WAF Classic FeatureAWS WAF ClassicAWS WAFV2Management of RulesRules execute directly in WebACLs.Modular sets of rules and re-use blocks allow for agility.CapacityNot a direct capacity constraint; not an efficient scaler.Works with WebACL Capacity Units (WCUs) for predictable and elastic capacity.Logs and AnalyticsBaseline logging capabilitiesJSON logging with integration in AWS Kinesis Data Firehose for deep analysis.API FeaturesStreamlined operations in the APIGranular APIs for automation and integration in a CI/CD pipeline.Managed RulesFewer managed rules.Enhanced group of managed rules with versioning in Amazon Marketplace.Custom ResponsesNot supported in WAF ClassicDefine custom HTTP messages for matched rules.Management of IP SetsBaseline configuration for sets of IPs.Integration with CIDR blocks for IPv4 and IPv6 with additional filtering granularity.Regex Pattern SetsPartial support for regex.Shared-use regex for flexible and elastic filtering. What WAFV2 Can Deliver Amazon WAFV2 introduces many improvements and is a smarter and more flexible security tool compared to WAF Classic. Its security capabilities address current security concerns and help companies secure their apps and maximize cost and performance savings. In the following sections, individual improvements in WAF V2 are discussed in detail, with an analysis of why migration is a smart investment. 1. Scalability and Flexibility AWS WAFV2 introduces WebACL Capacity Units (WCUs) for efficient configuration and rule scaling. Unlike quota and WAF Classic, WAFV2 dynamically provisions for traffic behavior, performance optimization, and cost savings. Flexible configuration creation in JSON introduces modularity and re-use of configuration sets for ease of administration. 2. Security Features Increased sets of regex patterns, sets of configuration of IP, and CIDR filtering introduce granular filtering in AWS WAFV2 for traffic. Customizable HTTP responses enable feedback to be delivered for request blocked, for security policies and positive feedback for positive user experience. 3. Advanced Logs and Analytics Format in JSON for logging, for use with CloudWatch and with AWS Kinesis Data Firehose for deep reporting and analysis, is included in AWS WAFV2. Logs can be filtered for important security events for ease in reporting and analysis for compliance requirements. 4. DevOps and Automation Granular APIs simplify integration with delivery in the form of Infrastructure as Code (IaC) with Terraform and CloudFormation. Automated workflows involve less intervention, with repeat configuration and quick delivery. 5. Managed Rules and Solutions Managed rules in AWS Marketplace differ for ease in deploying and securing for general vulnerabilities. Managed rules save custom development, with quick delivery and guaranteed security best practice compliance. 6. Cost Efficiency WCUs enable predictable pricing for use, with minimized overruns in expense. Organizations can save even more with optimized cost savings through the reuse of sets of rules and filtering logs for relevant information, with minimized expense for storing logs. Migration Considerations Migrating to WAFV2 is a planned migration, and companies will have access to new security capabilities and maximization of operational efficiency. It is preferable to migrate with a planned scheme, and in that case, downtime will not occur, security policies will not be disturbed, and configuration faults will not occur. Below is a systemic transition mechanism for a successful transition to WAFV2. Migrating to WAFV2 entails a sequence of phases, namely, evaluation, planning, rollout, testing, and optimization. Organizations first have to evaluate the present WAF Classic configuration and dependencies, then calculate a migration scheme with compatibility and performance requirements in mind. Sufficient testing in a testing environment identifies potential faults in anticipation and aids in full-fledged rollout. Monitoring and feedback loops in real-time double-verifies transition adheres to security and performance requirements. It should be as a planned migration, and companies will have access to new security capabilities and maximization of operational efficiency. It is preferable to migrate with a planned scheme, and in that case, downtime will not occur, security policies will not be disturbed, and configuration faults will not occur. What is below is a systemic transition mechanism for successful transition to WAFV2. 1. Analyze Dependencies Check current configuration, for example, WebACLs, rules, and dependencies such as distributions in Application Load Balancer (ALBs) and in CloudFront. Recording such items integrate and noting any compatibility issue with WAFV2 aids in planning for migration effectively. 2. Backup Configurations Before migration, save the AWS WAF Classic configuration and settings with AWS CLI. Store them in a secure location. It is even better to version them and make them restoration capable with zero faults. 3. Map Existing Rules to WAFV2 Go through existing rules and map them onto WAFV2 constructs. Set sets of repeatable rules wherever possible to make maintenance easy and enable scalability. Re-prioritize and retest logical constructs and rules in an effort to maintain desired behavior. Look for alternatives to use in place of custom rules with AWS Managed Rules, simplify complexity, and expand security coverages. 4. Test in Staging Environments Roll out WAFV2 settings first in a testing environment, then in production environments. Install in-place test cases and simulations in an effort to validate behavior of rules, inspect logging output, and confirm WCU requirements for desired volumes of traffic. Do performance testing in an effort to validate new rules, don't add additional latency, and simulate attack scenarios in an effort to validate protective capabilities. 5. Implement Incremental Rollout Roll out settings in phases with the least impact through blue/green methodologies for rollout. Roll out WAFV2 rules in monitor mode (Count) first in an effort to monitor the behavior of traffic and not actually stop any traffic. Roll out towards active (Block) mode following successful affirmation of correct behavior. 6. Rollback Plan Develop a rollback mechanism for immediate restoration in case of failure during migration. Have backup copies of the AWS WAF Classic configuration and validate restoration processes in preparation beforehand before proceeding with migration. Implement triggers for rollback, such as anomalous behavior in traffic or performance degradation, and automate rollback scripts to minimize downtime. Check for gaps in the rollback mechanism periodically and address them during testing. Roll out configurations in phases to minimize impact, using blue/green deployment techniques. Initially, deploy WAFV2 rules in monitor mode (Count) to observe traffic behavior without blocking it. Transition to active (Block) mode once the correct behavior is confirmed. Finally, implement rollback scenarios to address any anomalous behavior during the phased rollout. Ensure that rollback plans are thoroughly tested and refined for seamless recovery in case of unexpected issues. Conclusion AWS WAFV2 is a big improvement over WAF Classic with its feature-rich capabilities, such as flexible, elastic WebACL Capacity Units (WCUs), high-powered logging, reusable groups, and ease of integration with DevOps pipelines. Its modularity in management, with its added feature of managed rules providing predefined security policies, reduces configuration times for easier and quick deployment. When executed with a plan, the migration can allow organisations to utilise new capabilities for ease of administration, security, and cost savings. AWS WAF V2 is a cutting-edge web application tool with future-proof and automation features.

By Srinivas Chippagiri

CORE

Product Design vs Platform Design for Software Development

An enterprise architect must understand the business's need for building either a platform or a product. A product is software that has “off-the-shelf,” more generic features and functions. In contrast, a platform is a software or service that allows external parties to extend and develop complementary functions and services. Choosing the right fit for business needs is extremely important after careful consideration of factors for driving product/platform development. Platforms are more flexible in connecting two or more parties, usually called producers and consumers. In recent days, organizations have been more focused on building digital platforms or digital products based on the business vertical's needs. While anyone can build products on top of a platform, the platform itself is not the product. The program perspective difference between product and platform is important for project strategy, execution, and stakeholder management. The table below shows the difference between the product and the platform. Productplatform Users typically consume the product's features and functionality, with no scope to extend or modify the implementation. Users can extend and create new applications and services on top of the platform. Creates value from their inherent functions and futures. Facilitates interactions between applications and services. Generates revenue by directly selling the feature. Generates revenues from direct selling and selling developer tools and platform ecosystem. Also interacts with other systems. Focuses on solving specific problems for targeted business areas. Focus on creating a base structure, then others can utilize it to extend or build a new product or platform Not flexible to adapt to new futures for market changes. Flexible to adopt new futures for market changes. Product managers focus on defining roadmaps, feature prioritization, and managing releases. Platform managers focus on platform infrastructure, reliability, and scalability. More focus on core design and customer needs. More focus on API development, documentation, and scalability. Examples are Search engines, CRM products, SAP, Uber, etc. Examples are Twitter, LinkedIn, Instagram, etc.Most product designs have a long-term perspective.Designed in the view of integrations to business and infrastructure systems.Changes can't be made quickly.Changes can be adopted quickly. If any business wants the customer journey to cross all channels and markets, the architects should adopt platform thinking for developing the future of the service. This approach provides the reusability of underlying technologies and data platforms with a focus on efficiency, consistency, stability, and security. Platform teams operate like product teams. A vision and road map have been created to set the objectives of the platform team linked to business outcomes. Some organizations run product and platform teams, and the platforms can connect to the product to fetch the data the business needs. Banking organizations currently leverage disparate IT products and solutions to support their banking, financial, and operational tasks, which often contribute to data discrepancies, among other inefficiencies. Platforms may integrate data from different applications, helping to create consistency of data. In the platform approach, the end user can access the data from a single entity, i.e., a single point of truth. The platform APIs can be extended by third-party applications or other products. Nowadays, organizations are more focused on platform thinking than developing products due to the advantages of platforms, such as reduced time to market, increased developers’ productivity, reduced complexity, and improved security. Benefits of Platform Thinking An architect's mindset has to shift from product thinking to platform thinking to develop the platform in their business ecosystem. Platform thinking brings the following aspects: Individual product-to-ecosystem collaboration modelAccess to APIs for more convenience and flexibilityDecentralized orchestrationAllows other systems to build provides on top of the platformsAdopts the market changes quickly The platform development model accelerates the speed of the delivery. This is achieved by adopting best industry practices, patterns, and technology through the emphasis on reusing rather than building new code. Below are some of the guiding principles for designing the platform: Automation of processRepeatableSmall unit planning and deliverables using agileLevel of control and governanceBuild once and deploy wherever requiredSelf-service buildTraceability The design thinking framework helps the enterprise architects develop realistic and pragmatic platforms. Platform design requires discipline to learn, tailor, extend the future, and use the APIs and methods. Platforms are more focused on speed to market rather than efficiency. Architects have to adopt a balanced approach when choosing a platform or product. Architects have to decide on a platform or product design based on how big the ecosystems and maturity of the developers around the applications are, along with the stability of the APIs built around. Platform design thinking focuses more on board scope, long-term vision, and the effect of other ecosystems. On the other hand, product design thinking is more focused on less scope, short-term vision, and application-centric. Choosing the Right Model Choosing the right design model is important, and factors like industry and market demand for functions and futures, availability of resources in the organization, revenue goals, and end-user experience should be considered. Choosing between a product or platform model is a crucial decision with business implications. Each model offers unique strengths and challenges, and the right choice depends on resources, market dynamics, and business goals.

By Ravi Kiran Mallidi

CORE

Getting Started With LangChain for Beginners

Large language models (LLMs) like OpenAI’s GPT-4 and Hugging Face models are powerful, but using them effectively in applications requires more than just calling an API. LangChain is a framework that simplifies working with LLMs, enabling developers to create advanced AI applications with ease. In this article, we’ll cover: What is LangChain?How to install and set up LangChainBasic usage: Access OpenAI LLLs, LLMs on Hugging Face, Prompt Templates, ChainsA simple LangChain chatbot example What Is LangChain? LangChain is an open-source framework designed to help developers build applications powered by LLMs. It provides tools to structure LLM interactions, manage memory, integrate APIs, and create complex workflows. Benefits of LangChain Simplifies handling prompts and responsesSupports multiple LLM providers (OpenAI, Hugging Face, Anthropic, etc.)Enables memory, retrieval, and chaining multiple AI callsSupports building chatbots, agents, and AI-powered apps A Step-by-Step Guide Step 1: Installation To get started, install LangChain and OpenAI’s API package using pip, open your terminal, and run the following command: Plain Text pip install langchain langchain_openai openai Setup your API key in an environment variable: Plain Text import os os.environ["OPENAI_API_KEY"] = "your-api-key-here" Step 2: Using LangChain’s ChatOpenAI Now, let’s use OpenAI’s model to generate text. Basic Example: Generating a Response Python from langchain_openai import ChatOpenAI # Initialize the model llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.5) # write your prompt prompt = "What is LangChain ?" # print the response print(llm.invoke(prompt)) Explanation from langchain_openai import ChatOpenAI(). This imports the ChatOpenAI class from the langchain_openai package and allows to use OpenAI’s GPT-based models for conversational AI.ChatOpenAI(). This initializes the GPT model.model ="gpt-3.5-turbo". As Open AI has several models to use, we have to pass the model that we want to use for prompt response. However, by default, Open AI uses the text-davinci-003 model.temperaure=0.5. ChatOpenAI is initialized with a temperature of 0.5. Temperature controls randomness in the response: 0.0: Deterministic (always returns the same output for the same input).0.7: More creative/random responses.1.0: Highly random and unpredictable responses.Since temperature = 0.5, it balances between creativity and reliability.prompt = "What is LangChain?". Here, we are defining the prompt, which is coming from LangChain and will be sent to the ChatOpenAI model for processing.llm.invoke(prompt). This sends the prompt to the given LLM and gets the response. Step 3: Using Other LLM Models Using HuggingFacePipeline Python from langchain_huggingface import HuggingFacePipeline # Initialize the model, here are trying to use this model - google/flan-t5-base llm = HuggingFacePipeline.from_model_id( model_id="google/flan-t5-base", task="text-generation", pipeline_kwargs={"max_new_tokens": 200, "temperature" :0.1}, ) # print the response print(llm.invoke("What is Deep Learning?")) # In summary here we learned about using different llm using langchain, # instead OpenAI we used a model on Huggingface. # This helps us to interact with models uploaded by community. Step 4: Chaining Prompts With LLMs LangChain lets you connect prompts and models into chains. Python # Prompts template and chaining using langchain from langchain.prompts import PromptTemplate from langchain_openai import ChatOpenAI llm = ChatOpenAI(model_name="gpt-4o",temperature=0.9) # Prompt Template - let you generate prompts that accepts variable, # we can have multiple variables as well template = "What is the impact on my health, if I eat {food} and drink {drink}?" prompt = PromptTemplate.from_template(template) # Here chains comes into picture to go beyond single llm call # and involve sequence of llm calls, and chains llms and prompt togetger # Now we initialize our chain with prompt and llm model reference chain = prompt | llm # here we are invok the chain with food parameter as Bread and drink parameter as wine. print(chain.invoke({"food" : "Bread","drink":"wine"})) Why Use LangChain? Automates the process of formatting promptsHelps in multi-step workflowsMakes code modular and scalable Step 5: Chain Multiple Tasks in a Sequence LangChain chains allow to combine multiple chains, where output from first chain can be used as a input to second chain. Python from langchain_core.output_parsers import StrOutputParser from langchain_openai import ChatOpenAI from langchain.prompts import PromptTemplate llm = ChatOpenAI(model_name="gpt-4o", temperature=0) #first template and chain template = "Which is the most {adjectective} building in the world ?" prompt = PromptTemplate.from_template(template) chain = prompt | llm | StrOutputParser() #second template and chain with the first chain template_second = "Tell me more about the {building}?" prompt_second = PromptTemplate.from_template(template_second) chain_second = {"noun" : chain} | prompt_second | llm | StrOutputParser() #invoking the chains of calls passing the value to chain 1 parameter print(chain_second.invoke({"adjectective" : "famous"})) Why Use Sequential Chains? Merges various chains by using the output of one chain as the input for the next.Operates by executing a series of chainsCreating a seamless flow of processes Step 5: Adding Memory (Chatbot Example) Want your chatbot to remember past conversations? LangChain Memory helps! Python from langchain.chains import ConversationChain from langchain.memory import ConversationBufferMemory from langchain_openai import ChatOpenAI # Initialize model with memory llm = ChatOpenAI(model="gpt-3.5-turbo") memory = ConversationBufferMemory() # Create a conversation chain conversation = ConversationChain(llm=llm, memory=memory) # Start chatting! print(conversation.invoke("Hello! How is weather today ?")["response"]) print(conversation.invoke("Can I go out for biking today ?")["response"]) Why Use Memory? Enables AI to remember past inputsCreates a more interactive chatbotSupports multiple types of memory (buffer, summarization, vector, etc.) What’s Next? Here, we have explored some basic components of LangChain. Next, we will explore the below items to use the real power of LangChain: Explore LangChain agents for AI-driven decision-makingImplement retrieval-augmented generation (RAG) to fetch real-time data

By Amit Chaudhary

Challenges of Using LLMs in Production: Constraints, Hallucinations, and Guardrails

Large language models (LLMs) have risen in popularity after the release of Chat-GPT. These pre-trained foundation models enable rapid prototyping, and companies want to use this cool technology. However, their probabilistic nature and lack of built-in constraints often lead to challenges once they are out of prototyping mode. Let us consider an example of classifying news articles based on the content in the article to discuss the challenges one would encounter. Current LLMs have issues such as non-adherence to instructions, hallucinations, and possibly spitting out something that you don't want. This article explores these challenges with an example of classifying news articles into categories based on the content in the article and offers actionable strategies to mitigate them. Challenge 1: Constraint Adherence in Outputs Problem: Uncontrolled Category Generation When classifying news articles into categories, LLMs may categorize them into a large list of categories, making categorization ineffective. They may categorize an article as Sports and another similar article related to sports as entertainment. This could result in a large list of categories. Initial Solution: Predefined Labels and "Others" Buckets A common solution is to restrict outputs to a predefined list such as "Sports" or "Entertainment," with an "Others" category for all articles that cannot be categorized in any predefined categories. This can be addressed by using prompt engineering, which is the process of designing and refining inputs to guide LLMs in producing the desired output. In this example, the prompt can be updated to generate output by choosing a value from the predefined list of categories. While this may work in small tests, at scale, this could result in intermittent results not honoring the instructions provided in the prompts. LLMs may categorize articles as "Political Science" despite explicit instructions to choose from predefined categories. This undermines consistency, especially in systems relying on fixed taxonomies. Also, the "Others" category bucket often balloons due to: Ambiguity. Articles may have overlapping multiple categories.Model uncertainty. The model may have low confidence in some categories, so it is forced to make category choices.Edge cases. Some novel topics may not be covered by existing categories. Improved Approach: Validation Layers in Post-Processing Steps Instead of relying solely on prompts, implement a two-tiered validation system: Use a combination of deterministic and probabilistic post-processing. Use lookup tables to verify if the generated output honors the constraints. Resend the same request to the LLM again if the response does not honor the constraint, and discard the results if the response in the second attempt also does not honor the constraint. With good prompt engineering and this two-tiered post-processing, the occurrences of results not honoring the constraints would drop significantly. This reduces over-reliance on prompt engineering to enforce constraints and ensures higher accuracy. Challenge 2: Grounding Outputs in Truth Problem: Hallucinations and Fabrications LLMs lack intrinsic knowledge of ground truth, resulting in fabricating answers instead of acknowledging that they don’t know the answer. For instance, when classifying scientific articles, models might mislabel speculative content as peer-reviewed based on linguistic patterns alone. Solution: Augment With Retrieval-Augmented Generation (RAG) Retrieval-augmented generation (RAG) is the process of combining a user’s prompt with relevant external information to form a new, expanded prompt for an LLM. Giving the LLM all the information it needs to answer a question enables it to provide answers about topics it was not trained on and reduces the likelihood of hallucinations. An effective RAG solution must be able to find information relevant to the user’s prompt and supply it to the LLM. Vector search is the most commonly used approach for finding relevant data to be provided in the prompt to the model. Integrate RAG to anchor outputs in verified data: Step 1: Retrieve relevant context (e.g., a database of known peer-reviewed journal names or author credentials).Step 2: Prompt the LLM to cross-reference classifications with retrieved data. This forces the model to align outputs with trusted sources, reducing hallucinations. Challenge 3: Filtering Undesirable Content Problem: Toxic or Sensitive Outputs Even "safe" LLMs can generate harmful content or leak sensitive data from inputs (e.g., personal identifiers in healthcare articles). LLMs have in-built controls to prevent this, and these controls vary from model to model. Having guardrails outside of the model will help address gaps that models may have, and these guardrails can be used with any LLM. Solution: Layered Guardrails Input sanitization. Anonymize or scrub sensitive data (e.g., credit card numbers) in inputs before providing it to the model.Output sanitization. Sanitize the output from the model to remove toxic phrases or sensitive informationAudit trails. Log all inputs/outputs for compliance reviews. Most hyperscalers provide services that can be used for data sanitization. For example, Amazon Bedrock Guardrails can implement safeguards for your generative AI applications based on your use cases and responsible AI policies. Score the Results Let the model be its own critic. For this example, force the model to provide reasoning for each label it assigns. Provide this as input to the same model or a different one to provide a score on a pre-defined scale. Monitor this metric to understand the consistency and accuracy of the model. This score can be used to discard a label assigned if the score is less than a pre-defined value and re-run the analysis to generate a new label. This score can also be used to run A/B tests to experiment with multiple prompts Best Practices for Production-Grade LLM Systems Multi-layered validation. Combine prompt engineering, post-processing, and scoring the results to validate the generated results.Domain-specific grounding. Use RAG for factual accuracy to reduce the hallucination frequency of the model.Guard rails and continuous monitoring. Track metrics like Others rate in this example, results quality score, and guardrail services to provide production-ready services. Conclusion Developers can move LLMs from prototype to production by implementing post-processing validation, RAG, and monitoring to manage constraints, hallucinations, and safety.

By Shiva Pati

Why Mocking Sucks

Developers love mocking. It’s been a go-to solution for years: Simulate external services, run tests faster, and avoid the overhead of real APIs. But here’s the truth — mocking is overused and often dangerous. It deceives you into thinking your system is stable, hiding critical failures that only appear when your code hits the real world. APIs change. Rate limits throttle you. Authentication flows break. Your tests will still pass with green colors while your app crashes in production. Overusing mocks can turn your tests into a house of cards, giving you false confidence and creating hidden technical debt that slows teams down. In complex workflows — like payments or authentication — mocks fail to capture the true behavior of interconnected services. In this post, we pull back the curtain on the dangers of mocking, showing you when it works, when it fails miserably, and why relying on mock services can build technical debt faster. With real-world examples like Stripe and Auth0, you’ll see how mocking can backfire and why using real dev versions of services often leads to more robust software. Why Is Mocking Necessary? Mocking solves problems that often arise in modern software development, mainly when dealing with multi-tenant SaaS platforms or distributed systems. If you’re working with downloadable or offline software, mocks may not be as critical since your dependencies are within your control. However, mocking can become necessary when your application relies on third-party services — especially APIs you don’t own. Here’s why you might need to mock in specific scenarios: Testing failure scenarios. How can you simulate an API outage, rate limiting, or an error response like 500 Internal Server Error? With mocking, you can control responses to have confidence in how your application behaves under failure conditions.Resolving latency issues in tests. External services introduce latency. For example, if you’re testing customer registration through an external API, even a 500 ms response time can add up across hundreds of tests. Mocking replaces real service calls with near-instant simulated responses, allowing tests to run quickly.Simulating external services that aren’t ready. Backend APIs or third-party integrations may not be fully available during development in many projects. Mocking helps teams continue their work by simulating those services before they’re ready. When Mocking Works Well: Simple Service Simulations Mocking works best when simulating simple, isolated services with predictable behavior. For example, mocking is an excellent option if your app integrates with Stripe and you only need to test customer registration. You can simulate a successful customer registration call or even an API failure to verify your error-handling code — all without ever hitting Stripe’s servers. Python from unittest.mock import patch @patch('requests.post') def test_successful_registration(mock_post): # Mock the API response mock_post.return_value.status_code = 201 mock_post.return_value.json.return_value = { "id": "cus_12345", "name": "Test User", "email": "[email protected]" } # Call the function being tested result = register_customer("Test User", "[email protected]") # Verify the mock behavior and response assert result["id"] == "cus_12345" assert result["name"] == "Test User" assert result["email"] == "[email protected]" mock_post.assert_called_once_with( "https://api.stripe.com/v1/customers", data={"name": "Test User", "email": "[email protected]"} ) However, this approach falls apart when your workflow spans multiple services. Imagine testing a full Stripe payment flow: Registering a customer, adding items to a cart, and processing a payment. Mocking each step might seem feasible, but once you combine them, inter-service dependencies, timing issues, and API quirks won’t surface in your tests. Accurately testing complex workflows is especially critical for applications that use third-party services for authentication. For example, let’s say you are using Auth0 to manage authentication. Mocking here is risky because authentication is mission-critical, and updates can make your mocks obsolete, breaking your app in production. Worse, authentication failures can shatter user trust, leading to frustration, account lockouts, or even security vulnerabilities. When Mocking Sucks and Why Revisiting the Stripe example, maintaining mocks for the full simulated flow requires constant updates to match API changes, introduces inconsistencies, and fails to mimic the nuances of real-world interactions. Here are the issues: 1. Mocking Creates A False Sense Of Security Mocks only behave the way you program them to. They’ll never catch unexpected changes or errors that might occur with the real service, giving you the illusion that your system is working perfectly. Even worse, mocks can accidentally break your product by masking breaking changes. Imagine a situation where a third-party API modifies a key response format. If your mocks aren’t updated to reflect this change, your tests will continue to pass while your product experiences hidden failures in production. This false confidence leads to missed bugs, broken functionality, and a potentially massive impact on your users and business. 2. Mocking Increases Maintenance Overhead APIs evolve. New fields, endpoints, and even minor response tweaks can break your mocks. You’ll constantly need to update your test suite to keep up with these changes, which can result in technical debt that burdens your team. 3. Mocking Encourages Bad Testing Practices Developers often become complacent with mocks, focusing more on matching expected inputs and outputs than handling real-world edge cases. This leads to over-reliance on happy-path tests that fail to account for errors, latency, or timeouts in real environments. 4. Mocking Decouples You from Reality Mocks can’t reproduce the unpredictability of real services — rate limiting, version mismatches, or complex state changes in multi-tenant APIs. Tests that never hit real endpoints miss these critical factors, resulting in software unprepared for real-world conditions. 5. Mocking Is An Anti-Pattern For Complex Systems The more interconnected your services are, the harder it becomes to maintain accurate mocks. In a distributed system, service interactions are dynamic and often undocumented, meaning mocks will never fully reflect actual behavior. Over time, this leads to tests that become brittle and unreliable. 6. Mocking Hinders Real Developer Experience Developers often miss opportunities to work with real APIs early in development due to over-reliance on mocks. This delays the discovery of integration issues, ultimately shifting the pain to later stages, like QA or production. So What Is The Solution? In some cases, mocking may be your only option. When you do, use company-maintained mocks whenever possible, like Stripe’s stripe-mock, which stays in sync with their API and minimizes maintenance overhead. However, even the best mocks can’t replace the benefits of using sandbox or dev environments provided by real services. Use sandbox APIs to run realistic integration tests, but be prepared to face latency, rate limits, and downtime. These issues can disrupt your tests and waste time. Local-First vs. Mock-Driven Development This “local-first” development approach aligns with broader trends in modern software engineering. Developers are increasingly favoring real, self-contained environments over artificial mocks. Tools like Docker, Kubernetes, and local microservice setups have empowered teams to replicate production-like conditions at every stage of development. The idea is simple: The more your tests reflect reality, the fewer issues you’ll face when deploying to production. Mocks can still be helpful for specific, isolated tests, but local-first is the future for complex, business-critical systems like authentication. Here is a table summarizing the differences between mock-driven and local-first development: CategoryMock-Driven DevelopmentLocal-First DevelopmentMaintenanceRequires constant updates to stay in sync with evolving APIs.Minimal maintenance; tested component up to date with production behavior.ReliabilityMocks can mask breaking changes and hidden errors.Real services expose real-world issues at dev time.Developer ExperienceDelays integration issue discovery until QA or production.Developers catch and fix integration issues early in development. But Isn’t Prod Always Going To Be Different Than Dev? Your production services will always be different from dev in an operational sense; they’ll have more load and more data if nothing else. But by using local-first development rather than mocks, you can keep developers closer to the reality of production, discovering issues sooner and reducing test maintenance. Real-World Examples Of Local-First Development Tools The trend is clear: It’s time to embrace local-first development to provide developers with reliable, production-grade environments they can run locally. By offering real services for development and testing, this approach empowers teams to build with greater confidence and fewer surprises in production. Firebase Emulator Suite Firebase, widely used for authentication, real-time databases, and cloud functions, offers a local emulator suite. The tool allows developers to simulate core Firebase services in their development environment, removing the need for fragile mocks. You can test real authentication flows, database queries, and cloud function triggers without depending on the live Firebase servers.The emulator provides feature parity with production, allowing reliable integration tests free from rate limits and connectivity issues. Let’s see how mocking a Firebase service and using the Firebase Emulator Suite differ. A common approach to testing Firebase authentication and Firestore interactions is to mock the Firebase Admin SDK. Below is an example of how developers typically do this in Python. Python from unittest.mock import patch, MagicMock import firebase_admin from firebase_admin import auth, firestore, credentials # Initialize Firebase app (mocked in tests) cred = credentials.Certificate("./service_account_key_example.json") firebase_admin.initialize_app(cred) # Function to authenticate a user and retrieve an ID token def authenticate_user(uid): user = auth.get_user(uid) token = auth.create_custom_token(uid) return {"uid": user.uid, "token": token} # Function to fetch a document from Firestore def get_user_data(uid): db = firestore.client() doc_ref = db.collection("users").document(uid) doc = doc_ref.get() return doc.to_dict() if doc.exists else None # Unit test with mocks @patch("firebase_admin.auth.get_user") @patch("firebase_admin.auth.create_custom_token") @patch("firebase_admin.firestore.client") def test_authenticate_user(mock_firestore, mock_create_token, mock_get_user): # Mock user data mock_get_user.return_value = MagicMock(uid="12345") mock_create_token.return_value = "fake-token" result = authenticate_user("12345") assert result["uid"] == "12345" assert result["token"] == "fake-token" @patch("firebase_admin.firestore.client") def test_get_user_data(mock_firestore): # Mock Firestore response mock_doc = MagicMock() mock_doc.exists = True mock_doc.to_dict.return_value = {"name": "John Doe", "email": "[email protected]"} mock_firestore.return_value.collection.return_value.document.return_value.get.return_value = mock_doc result = get_user_data("12345") assert result["name"] == "John Doe" assert result["email"] == "[email protected]" While this approach allows for isolated testing, it introduces several problems: Mocks do not capture Firebase behavior changes, such as new API parameters or modified authentication flows.Firestore queries in production may behave differently from their mocked versions, especially when dealing with security rules and indexes.The real Firebase service enforces rate limits and authentication constraints, which mocks ignore. So, instead of mocking Firebase, you can use the Firebase Emulator Suite to run tests against a fully functional local Firebase instance, which behaves identically to production. Step 1: Install And Configure Firebase Emulator Install the Firebase CLI. Plain Text npm install -g firebase-tools Initialize the project. Plain Text firebase init Initialize Firebase Emulators. Plain Text firebase init emulators Start the emulator. Plain Text firebase emulators:start Step 2: Modify Your Code To Use The Emulator With Firebase running locally, you can modify the authentication and Firestore functions to connect to the emulator instead of mocking API calls. Plain Text import firebase_admin from firebase_admin import auth, firestore, credentials # Connect to Firebase Emulator cred = credentials.Certificate("./service_account_key_example.json") firebase_admin.initialize_app(cred, { "projectId": "demo-project" }) # Set Firebase Emulator URLs import os os.environ["FIRESTORE_EMULATOR_HOST"] = "localhost:8080" os.environ["FIREBASE_AUTH_EMULATOR_HOST"] = "localhost:9099" # Function to authenticate a user using the local Firebase Emulator def authenticate_user_emulator(uid): user = auth.get_user(uid) token = auth.create_custom_token(uid) return {"uid": user.uid, "token": token} # Function to fetch user data from Firestore in the Emulator def get_user_data_emulator(uid): db = firestore.client() doc_ref = db.collection("users").document(uid) doc = doc_ref.get() return doc.to_dict() if doc.exists else None # Integration test with the Firebase Emulator def test_firebase_emulator(): # Create test user try: user = auth.create_user(uid="test-user", email="[email protected]", password="password123") assert user.uid == "test-user" except firebase_admin._auth_utils.UidAlreadyExistsError: pass # Authenticate and get a token result = authenticate_user_emulator("test-user") assert "token" in result # Write user data to Firestore db = firestore.client() db.collection("users").document("test-user").set({"name": "John Doe", "email": "[email protected]"}) # Retrieve user data user_data = get_user_data_emulator("test-user") assert user_data["name"] == "John Doe" assert user_data["email"] == "[email protected]" test_firebase_emulator() The tests can reflect production-like conditions, catching issues like authentication changes, security rule enforcement, and API updates. FusionAuth Kickstart FusionAuth Kickstart allows you to build a template that replicates a development or production environment. Developers can spin up a local FusionAuth instance to test full authentication flows, including OAuth and SSO, ensuring tests align with production.Kickstart automates environment setup, account creation, and configuration of any resources. Then you have a FusionAuth instance with available data.Unlike mocks, this approach handles real-world complexities like security updates and multi-step flows, reducing the risk of surprises in production. To see how effective a real dev server can be, let’s write a mock that simulates a login on the FusionAuth API. Here’s how you might mock the login in Python: Python import requests from unittest.mock import patch import unittest def fusionauth_login(login_id, password, application_id, base_url="https://sandbox.fusionauth.io"): url = f"{base_url}/api/login" headers = {"Content-Type": "application/json"} data = { "loginId": login_id, "password": password, "applicationId": application_id } response = requests.post(url, json=data, headers=headers) if response.status_code == 200: return {"status": "success", "token": response.json().get("token")} elif response.status_code == 404: return {"status": "error", "message": "User not found or incorrect password"} elif response.status_code == 423: return {"status": "error", "message": "User account is locked"} else: return {"status": "error", "message": "Unknown error"} class TestFusionAuthLogin(unittest.TestCase): @patch("requests.post") def test_successful_login(self, mock_post): mock_post.return_value.status_code = 200 mock_post.return_value.json.return_value = { "token": "fake-jwt-token", "user": {"id": "12345", "email": "[email protected]"} } result = fusionauth_login("[email protected]", "correct-password", "app-123") self.assertEqual(result["status"], "success") self.assertIn("token", result) @patch("requests.post") def test_invalid_credentials(self, mock_post): mock_post.return_value.status_code = 404 mock_post.return_value.json.return_value = {} result = fusionauth_login("[email protected]", "wrong-password", "app-123") self.assertEqual(result["status"], "error") self.assertEqual(result["message"], "User not found or incorrect password") @patch("requests.post") def test_unknown_error(self, mock_post): mock_post.return_value.status_code = 500 mock_post.return_value.json.return_value = {} result = fusionauth_login("[email protected]", "password", "app-123") self.assertEqual(result["status"], "error") self.assertEqual(result["message"], "Unknown error") if __name__ == "__main__": unittest.main() The fusionauth_login function simulates a login request to FusionAuth’s /api/login endpoint. It handles various responses — success, incorrect credentials, locked accounts, and unexpected errors. The unit tests use unittest.mock.patch to replace real API calls, ensuring tests pass without needing a live FusionAuth server. But this approach doesn’t scale. For every new scenario, another mock is needed. More tests mean more mocks, more maintenance, and more fragile tests. What starts as a simple test suite quickly turns into a tangled web of artificial responses, each one detached from reality. And when FusionAuth updates? Your mocks stay frozen in time. New fields, changed response structures, evolving authentication flows — none of these are reflected in your tests. The mocks keep passing, but your application is broken in production. The alternative? Run a real FusionAuth instance locally. Tip: The easiest way to run FusionAuth is in a Docker container. Clone the fusionauth-example-docker-compose GitHub repository. Open a terminal in the light subdirectory, and run docker compose up in a terminal to start FusionAuth. Log in at http://localhost:9011 with [email protected] and password. The example repository will use the sample Kickstart file to configure the FusionAuth instance. Below is an outline of the steps you can use to set up a FusionAuth instance with Kickstart and Docker Compose, as configured in the fusionauth-example-docker-compose repository, to test authentication without mocks. Step 1: Create The Kickstart File A Kickstart file is a JSON file containing instructions for setting up an environment. To create a test case that registers a user, the Kickstart file might look like the example below. Python { "variables": { "apiKey": "33052c8a-c283-4e96-9d2a-eb1215c69f8f-not-for-prod", "asymmetricKeyId": "#{UUID()}", "applicationId": "e9fdb985-9173-4e01-9d73-ac2d60d1dc8e", "clientSecret": "super-secret-secret-that-should-be-regenerated-for-production", "defaultTenantId": "d7d09513-a3f5-401c-9685-34ab6c552453", "adminEmail": "[email protected]", "adminPassword": "password", "adminUserId": "00000000-0000-0000-0000-000000000001", "userEmail": "[email protected]", "userPassword": "password", "userUserId": "00000000-0000-0000-0000-111111111111" }, "apiKeys": [ { "key": "#{apiKey}", "description": "Unrestricted API key" } ], "requests": [ { "method": "POST", "url": "/api/key/generate/#{asymmetricKeyId}", "tenantId": "#{defaultTenantId}", "body": { "key": { "algorithm": "RS256", "name": "For example app", "length": 2048 } } }, { "method": "POST", "url": "/api/application/#{applicationId}", "tenantId": "#{defaultTenantId}", "body": { "application": { "name": "Example App", "oauthConfiguration": { "authorizedRedirectURLs": [ "https://fusionauth.io" ], "logoutURL": "https://fusionauth.io", "clientSecret": "#{clientSecret}", "enabledGrants": [ "authorization_code", "refresh_token" ], "generateRefreshTokens": true, "requireRegistration": true }, "jwtConfiguration": { "enabled": true, "accessTokenKeyId": "#{asymmetricKeyId}", "idTokenKeyId": "#{asymmetricKeyId}" } } } }, { "method": "POST", "url": "/api/user/registration/#{adminUserId}", "body": { "registration": { "applicationId": "#{FUSIONAUTH_APPLICATION_ID}", "roles": [ "admin" ] }, "roles": [ "admin" ], "skipRegistrationVerification": true, "user": { "birthDate": "1981-06-04", "data": { "favoriteColor": "chartreuse" }, "email": "#{adminEmail}", "firstName": "Erlich", "lastName": "Bachman", "password": "#{adminPassword}", "imageUrl": "//www.gravatar.com/avatar/5e7d99e498980b4759650d07fb0f44e2" } } }, { "method": "POST", "url": "/api/user/registration/#{userUserId}", "body": { "user": { "birthDate": "1985-11-23", "email": "#{userEmail}", "firstName": "Richard", "lastName": "Flintstone", "password": "#{userPassword}" }, "registration": { "applicationId": "#{applicationId}", "data": { "favoriteColor": "turquoise" } } } } ] } This code declares variables to avoid repetition, and then defines an API key. It then executes a series of requests to: Create a new application.Create a new user.Register the user to the application. Step 2: Download The Docker Compose And Environment Files Run the following commands to download the required files: Shell curl -o docker-compose.yml https://raw.githubusercontent.com/FusionAuth/fusionauth-containers/main/docker/fusionauth/docker-compose.yml curl -o .env https://raw.githubusercontent.com/FusionAuth/fusionauth-containers/main/docker/fusionauth/.env Edit these files to match your environment. In the .env file, modify DATABASE_PASSWORD and ensure the POSTGRES_USER and POSTGRES_PASSWORD are set correctly. Add the FUSIONAUTH_APP_KICKSTART_FILE environment variable with the path to the Kickstart file, and mount the Kickstart directory as a volume on the fusionauth service. Step 3: Start The FusionAuth Containers Run the following commands to bring up the services: Plain Text docker compose up This command starts three services: db - A PostgreSQL instance to store your data.search- An OpenSearch instance for advanced search features.fusionauth - The main application handling authentication flows. Step 4: Explore And Configure FusionAuth FusionAuth will now be accessible at http://localhost:9011. You can configure your application, manage users, and integrate authentication workflows from here. The setup uses OpenSearch by default, but you can modify the docker-compose.yml file to switch between different search engines if needed. Step 5: Customize The Services You can further customize the services by editing the docker-compose.yml and .env files. FusionAuth supports various deployment methods (for example, Kubernetes and Helm), making it adaptable to different environments. Learn more about working in a development environment in the development documentation. Now, you can rewrite the tests without mocking the FusionAuth API. Python import requests BASE_URL = "http://localhost:9011" APPLICATION_ID = "e9fdb985-9173-4e01-9d73-ac2d60d1dc8e" API_KEY = "33052c8a-c283-4e96-9d2a-eb1215c69f8f-not-for-prod" def fusionauth_login(login_id, password, application_id): """ Function to authenticate a user against FusionAuth. """ url = f"{BASE_URL}/api/login" headers = { "Authorization": API_KEY, "Content-Type": "application/json" } data = { "loginId": login_id, "password": password, "applicationId": application_id } response = requests.post(url, json=data, headers=headers) if response.status_code == 200: return {"status": "success", "token": response.json().get("token")} elif response.status_code == 404: return {"status": "error", "message": "User not found or incorrect password"} else: return {"status": "error", "message": f"Unknown error: {response.text}"} def test_successful_login(): """ Test successful authentication with correct credentials. The user must exist in FusionAuth before running this test. """ result = fusionauth_login("[email protected]", "password", APPLICATION_ID) assert result["status"] == "success" assert "token" in result def test_invalid_credentials(): """ Test authentication with invalid credentials. """ result = fusionauth_login("[email protected]", "wrong-password", APPLICATION_ID) assert result["status"] == "error" assert result["message"] == "User not found or incorrect password" def test_unknown_error(): """ Test handling of unknown errors by using an invalid application ID. """ result = fusionauth_login("[email protected]", "password", "INVALID_APP_ID") assert result["status"] == "error" assert "Unknown error" in result["message"] Integration tests run against a real authentication service within the same network as the application, replicating production-like conditions. The authentication flow is validated end-to-end, ensuring that security headers, and real response times are properly accounted for before deployment. Final Thoughts: Stop Relying On Mocks Mocking is risky due to the fact that it does not capture a realistic environment. That is why you should use real dev versions of services for: Better production alignment. Your tests reflect real-world conditions, including API changes, security updates, and unexpected behavior.Lower maintenance. Real services stay consistent with production, eliminating the need for constant mock updates and reducing technical debt.Accurate testing. Complex workflows like authentication, payments, or multi-step integrations behave correctly, uncovering edge cases early.Developer confidence. With realistic tests, you ship features knowing they’ll perform reliably in production.

By Andy Pai

Using Terraform Moved Block to Refactor Resources

Terraform introduced the moved block in version 1.1.0. This block provides a straightforward way to refactor resources by explicitly mapping old resource addresses to new ones. It significantly reduces the risk of losing state or manually managing imports during renames or moves. In this article, we’ll explain what the moved block is and how to use it to streamline resource restructuring for smoother and safer Terraform updates. What Is the Terraform Moved Block? A Terraform moved block is used within a module’s moved section to declare the migration of a resource or data source from one address to another. It is helpful for refactoring Terraform code while ensuring that state information is preserved, preventing Terraform from destroying and recreating resources unnecessarily. The moved block also handles resource renaming or movement across modules, making state management seamless. The syntax of the Terraform moved block is as follows: Plain Text moved { from = "<old_address>" to = "<new_address>" } Where: from is the original address of the resource or module in the previous configurationto is the new address of the resource or module in the updated configuration When to Use the Terraform Moved Block? The moved block in Terraform is used when you need to refactor or rename resources in your configuration without destroying and recreating them. It can be useful for: Renaming resources: When you change the name of a resource within the configuration (e.g., aws_instance.old_name to aws_instance.new_name).Reorganizing modules: If a resource is moved between modules or changed from a root-level resource to a module-managed resource (or vice versa), you can use the moved block to keep Terraform aware of the transition.Refactoring module names: When you are renaming or reorganizing modules to align with new naming conventions or standards.Changing resource block types: If you change the type of a resource block (e.g., from google_compute_instance to google_compute_engine), the moved block maps the old resource to the new type. Note that this is possible only if the configurations are compatible and the resource’s state remains valid. Splitting or consolidating configurations: When you split a configuration into multiple files or modules or consolidate it into fewer files, use the moved block to reflect these changes without resource re-creation. How to Use the Terraform Moved Block Let’s consider some examples: Example 1: Renaming a Resource In this example, we will rename a Terraform resource from aws_instance.old_name to aws_instance.new_name. Plain Text # Old Configuration # This block is no longer present resource "aws_instance" "old_name" { ami = "ami-12345678" instance_type = "t2.micro" } # New Configuration resource "aws_instance" "new_name" { ami = "ami-12345678" instance_type = "t2.micro" } # Moved Block moved { from = "aws_instance.old_name" to = "aws_instance.new_name" } Here, the moved block tells Terraform that the resource aws_instance.old_name has been renamed to aws_instance.new_name. Note: After making these changes, make sure you run terraform plan to verify that Terraform recognizes the migration correctly and that no unnecessary changes will be applied. Example 2: Moving Resources Between Modules In the next example, a resource is moved from the root module to a nested module. We want the aws_s3_bucket.my_bucket resource to be moved from the root module to a module named storage: Plain Text # Old Configuration (Root Module) # This block is no longer present resource "aws_s3_bucket" "my_bucket" { bucket = "example-bucket" } # New Configuration (Nested Module) # modules/storage/main.tf resource "aws_s3_bucket" "my_bucket" { bucket = "example-bucket" } # Root Module Configuration module "storage" { source = "./modules/storage" } # Moved Block moved { from = "aws_s3_bucket.my_bucket" to = "module.storage.aws_s3_bucket.my_bucket" } The moved block updates the Terraform state to reflect the new location without destroying and recreating the resource. Example 3: Using Moved With for_each We will now move a resource to a new name using a for_each loop, which creates multiple resources dynamically. Plain Text # Old Configuration # This block is no longer present resource "aws_security_group" "old_group" { for_each = toset(["web", "db"]) name = "sg-${each.key}" } # New Configuration resource "aws_security_group" "new_group" { for_each = toset(["web", "db"]) name = "sg-${each.key}" } # Moved Block moved { from = "aws_security_group.old_group[\"web\"]" to = "aws_security_group.new_group[\"web\"]" } moved { from = "aws_security_group.old_group[\"db\"]" to = "aws_security_group.new_group[\"db\"]" } Each resource in the old_group is mapped to a corresponding resource in new_group using moved blocks. This ensures a smooth state migration for all instances of the resources without unnecessary recreation. Example 4: Moving a Resource With a Changed Identifier In this example, a resource’s identifier changes because the for_each key expression is modified. We want to rename the for_each keys from ["app", "db"] to ["frontend", "database"]. Without the moved block, Terraform would see these as new resources and destroy the old ones. Plain Text # Old Configuration resource "aws_security_group" "sg" { for_each = toset(["app", "db"]) name = "sg-${each.key}" } # New Configuration resource "aws_security_group" "sg" { for_each = toset(["frontend", "database"]) # Changed identifiers name = "sg-${each.key}" } # Moved Blocks moved { from = "aws_security_group.sg[\"app\"]" to = "aws_security_group.sg[\"frontend\"]" } moved { from = "aws_security_group.sg[\"db\"]" to = "aws_security_group.sg[\"database\"]" } Each moved block maps the old key (app or db) to the new key (frontend or database). When you run terraform plan and terraform apply, Terraform updates the state to match the new identifiers, avoiding unnecessary destruction and re-creation of the security groups. Example 5: Moving a Resource Between Providers If we want to migrate a resource from one provider to another (e.g., from aws to aws.other_region), the Terraform moved block ensures a seamless state transition. In our example, the S3 bucket resource is initially in the us-west-1 region using the default aws provider. We want to manage the same bucket in a different AWS region (e.g., us-east-1) using a different provider alias. Plain Text # Old Configuration provider "aws" { region = "us-west-1" } resource "aws_s3_bucket" "example_bucket" { bucket = "example-bucket" } # New Configuration provider "aws" { region = "us-west-1" } provider "aws" { alias = "other_region" region = "us-east-1" } resource "aws_s3_bucket" "example_bucket" { provider = aws.other_region bucket = "example-bucket" } # Moved Block moved { from = "aws_s3_bucket.example_bucket" to = "aws_s3_bucket.example_bucket" } The moved block ensures Terraform updates the state to associate the resource with the new provider configuration (aws.other_region). The bucket itself remains intact in AWS and is not recreated, avoiding downtime or data loss. Terraform Moved Block Limitations The moved block in Terraform is useful for handling resource renaming or moving across modules in a safe and structured manner, but it has some limitations: Manual specification. The moved block requires you to manually specify both the source and destination addresses, which can lead to human error if the information is not provided accurately.Resource renames only. The moved block is specifically designed for renaming or moving resources within the state. Splitting or restructuring state might require manual state modifications or the use of tools like terraform state mv.Static declaration. The moved block is static and does not support conditional logic. It cannot handle scenarios in which the movement depends on dynamic conditions.Limited to planned changes. The moved block is only applied during the terraform plan and apply commands. If you modify the state file manually, the moved block will not help reconcile those changes.Requires state compatibility. The source and destination must be part of the same state file. This is particularly important for users working with remote backends.Only supported in Terraform 1.1+. Older versions of Terraform do not recognize the moved block, making it incompatible with legacy configurations or environments. Key Points The moved block in Terraform is a practical tool for restructuring resources without disrupting infrastructure. It ensures seamless state updates, avoids downtime, and simplifies complex refactoring tasks, allowing you to reorganize code confidently and maintain operational stability — provided the mappings are accurate and proper planning is in place.

By Mariusz Michalowski

Data Engineering

Functions of Data Engineering

AI/ML

Big Data

Data

Databases

IoT

DZone's Featured Data Engineering Resources

The Latest Data Engineering Topics