Microsoft Foundry Cross-Region with Private Endpoints (Part 1)

🎯 TL;DR: Deploy Microsoft Foundry Cross-Region with Private Endpoints

Microsoft Foundry isn’t available in every Azure region, but data residency requirements often mandate that all data at rest stays within specific regions. This post demonstrates how to keep your data in your compliant region (e.g., New Zealand North, or NZN) while leveraging Microsoft Foundry in another region (e.g., Australia East, or AUE) purely for AI inferencing. Using cross-region Private Endpoints over Azure’s backbone network, applications securely access Foundry’s AI capabilities without data traversing the public internet, maintaining both regional compliance and a zero-trust security posture.

The Solution: All data at rest, applications, and Private Endpoints remain in NZN. Microsoft Foundry deployed in AUE provides AI inferencing only. Private connectivity ensures a secure, compliant architecture across regions.


When deploying Microsoft Foundry (formerly Azure AI Foundry) in enterprise environments, you’ll face a critical constraint: Microsoft Foundry isn’t available in every Azure region, yet data residency requirements mandate that all data at rest remains within specific regions.

Imagine this scenario: Your organization must keep all data in New Zealand North due to regulatory compliance, but Microsoft Foundry is only available in Australia East. You can’t move data to AUE, but you need Foundry’s AI capabilities. How do you maintain compliance while accessing AI inferencing services?

The solution is architectural: Keep all data at rest in your compliant region (NZN) and use Microsoft Foundry in the available region (AUE) purely for AI inferencing. By deploying cross-region Private Endpoints, applications in NZN securely access Foundry’s AI services over Azure’s backbone network—no public internet, no data residency violations, no compromises.

This guide walks through the complete architecture, DNS configuration, security considerations, and implementation steps for deploying this cross-region private endpoint pattern.

⚠️ Important: Foundry Agents Service Limitation

If you plan to use the Foundry Agents service specifically, there is a known limitation at the time of writing: all Foundry workspace resources (Cosmos DB, Storage Account, AI Search, Foundry Account, Project, Managed Identity, Azure OpenAI, or other Foundry resources used for model deployments) must be deployed in the same region as the VNet.

This means the cross-region pattern described in this post will not work for Foundry Agents deployments—you would need to deploy everything in the same region (e.g., all resources in Australia East where Foundry is available).

However, if you are NOT using the Foundry Agents service (i.e., you’re only using Foundry for AI inferencing via API calls—OpenAI models, Speech Services, Vision, etc.), then the cross-region private endpoint pattern works perfectly, and all your data can reside in your chosen compliant region as described in this post.

For more details, see Microsoft Learn - Virtual Networks with Foundry Agents - Known Limitations.
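When the pattern applies, inference from NZN is just an ordinary API call against the Foundry endpoint; because private DNS resolves that hostname to the private endpoint, the traffic rides Azure’s backbone with no code changes. A minimal sketch using the openai Python SDK (the endpoint, key handling, and deployment name below are placeholders, not values from a real deployment):

from openai import AzureOpenAI

# Placeholder endpoint and deployment -- substitute your own. From inside
# the NZN VNet this FQDN resolves to the private endpoint IP (10.1.2.4),
# so the request never touches the public internet.
client = AzureOpenAI(
    azure_endpoint="https://myFoundry.cognitiveservices.azure.com",
    api_key="<your-key>",  # or pass azure_ad_token_provider for Entra ID
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="gpt-4o",  # the name of your model deployment
    messages=[{"role": "user", "content": "Hello from New Zealand North!"}],
)
print(response.choices[0].message.content)

The overall architecture looks like this: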

flowchart TB
    subgraph azure["☁️ Azure Backbone"]
        direction TB
        subgraph NZN["🌏 NZN - Data Residency Region"]
            direction TB
            subgraph vnet["VNet: 10.1.0.0/16"]
                subgraph appsnet["Subnet: snet-apps • 10.1.1.0/24"]
                    client["👤 Client App / VM<br/>10.1.1.10"]
                    data[("💾 Data at Rest<br/>Storage, SQL, etc.")]
                end
                subgraph pesnet["Subnet: snet • 10.1.2.0/24"]
                    pe["🔒 Private Endpoint<br/>10.1.2.4"]
                end
            end
            dns["🔐 Private DNS Zones<br/>Resolves to Private IP"]
        end
        subgraph AUE["🌏 AUE - AI Inferencing"]
            foundry[["🤖 Microsoft Foundry<br/>myFoundry.cognitiveservices.azure.com"]]
        end
        pe ==>|"🔐 Private Link"| foundry
    end
    internet[/"🌐 Public Internet<br/>❌ Blocked"/]
    client --> dns
    dns -.->|10.1.2.4| pe
    client -->|HTTPS| pe
    foundry -.-x internet
    style azure fill:#f5f5f5,stroke:#666,stroke-width:2px,stroke-dasharray: 5 5
    style NZN fill:#e3f2fd,stroke:#1976d2,stroke-width:3px
    style AUE fill:#e8f5e9,stroke:#388e3c,stroke-width:3px
    style internet fill:#ffebee,stroke:#c62828,stroke-width:2px
    style vnet fill:#e1f5fe,stroke:#0288d1,stroke-width:2px
    style dns fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style pe fill:#fff3e0,stroke:#ef6c00,stroke-width:3px
    style data fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
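A quick way to confirm the DNS half of the picture: from a VM inside the NZN VNet, resolve the Foundry FQDN and check that the answer is the private endpoint IP rather than a public address. A small sketch (the FQDN and IP range are taken from the diagram above; substitute your own):

import socket

# The example FQDN from the diagram -- replace with your Foundry endpoint.
fqdn = "myFoundry.cognitiveservices.azure.com"

# Inside the VNet, the linked private DNS zone should answer with the
# private endpoint IP; from outside, you would see a public IP instead.
ip = socket.gethostbyname(fqdn)
print(f"{fqdn} -> {ip}")
assert ip.startswith("10.1.2."), "Resolution is not using the private endpoint!"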
Read more
Building Voice Agents with Azure Communication Services Voice Live API and Azure AI Agent Service

🎯 TL;DR: Real-time Voice Agent Implementation

This post walks through building a voice agent that connects traditional phone calls to Azure’s AI services. The system intercepts incoming calls via Azure Communication Services, streams audio in real-time to the Voice Live API, and processes conversations through pre-configured AI agents in Azure AI Studio. The implementation uses FastAPI for webhook handling, WebSocket connections for bidirectional audio streaming, and Azure Managed Identity for authentication (no API keys to manage). The architecture handles multiple concurrent calls on a single Python thread using asyncio.

Implementation details: Audio resampling between 16kHz (ACS requirement) and 24kHz (Voice Live requirement), connection resilience for preview services, and production deployment considerations. Full source code and documentation available here


Recently, I found myself co-leading an innovation project that pushed me into uncharted territory. The challenge? Developing a voice-based agentic solution with an ambitious goal - routing at least 25% of current contact center calls to AI voice agents. This was bleeding-edge stuff, with both the Azure Voice Live API and Azure AI Agent Service voice agents still in preview at the time of writing.

When you’re working with preview services, documentation is often sparse, and you quickly learn that reverse engineering network calls and maintaining close relationships with product teams becomes part of your daily routine. This blog post shares the practical lessons learned and the working solution we built to integrate these cutting-edge services.

The Innovation Challenge

Building a voice agent system that could handle real customer interactions meant tackling several complex requirements:

  • Real-time voice processing with minimal latency
  • Natural conversation flow without awkward pauses
  • Integration with existing contact center infrastructure
  • Scalability to handle multiple concurrent calls
  • Reliability for production use cases

With both the Azure Voice Live API and Azure AI Agent Service in preview, we were essentially building on shifting sands. But that’s what innovation is about - pushing boundaries and finding solutions where documentation doesn’t yet exist.

Understanding the Architecture

Our solution bridges Azure Communication Services (ACS) with Azure AI services to create an intelligent voice agent. Here’s how the pieces fit together:

graph TB
    subgraph "Phone Network"
        PSTN["📞 PSTN Number<br/>+1-555-123-4567"]
    end
    subgraph "Azure Communication Services"
        ACS["🔗 ACS Call Automation<br/>Event Grid Webhooks"]
        MEDIA["🎵 Media Streaming<br/>WebSocket Audio"]
    end
    subgraph "Python FastAPI App"
        API["🐍 FastAPI Server<br/>localhost:49412"]
        WS["🔌 WebSocket Handler<br/>Audio Processing"]
        HANDLER["⚡ Media Handler<br/>Audio Resampling"]
    end
    subgraph "Azure OpenAI"
        VOICE["🤖 Voice Live API<br/>Agent Mode<br/>gpt-4o Realtime"]
        AGENT["👤 Pre-configured Agent<br/>Azure AI Studio"]
    end
    subgraph "Dev Infrastructure"
        TUNNEL["🚇 Dev Tunnel<br/>Public HTTPS Endpoint"]
    end
    PSTN -->|Incoming Call| ACS
    ACS -->|Webhook Events| TUNNEL
    TUNNEL -->|HTTPS| API
    ACS -->|WebSocket Audio| WS
    WS -->|PCM 16kHz| HANDLER
    HANDLER -->|PCM 24kHz| VOICE
    VOICE -->|Agent Processing| AGENT
    AGENT -->|AI Response| VOICE
    VOICE -->|AI Response| HANDLER
    HANDLER -->|PCM 16kHz| WS
    WS -->|Audio Stream| ACS
    ACS -->|Audio| PSTN
    style PSTN fill:#ff9999
    style ACS fill:#87CEEB
    style API fill:#90EE90
    style VOICE fill:#DDA0DD
    style TUNNEL fill:#F0E68C
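One detail worth calling out from the diagram: the media handler converts between the 16 kHz PCM that ACS streams and the 24 kHz PCM that Voice Live expects, in both directions. As an illustration of the idea (a sketch only; a production system would use a proper polyphase resampler), linear interpolation over the raw samples is enough to see what’s involved:

import numpy as np

def resample_pcm16(pcm_bytes: bytes, src_rate: int, dst_rate: int) -> bytes:
    """Linearly interpolate mono 16-bit PCM from src_rate to dst_rate."""
    samples = np.frombuffer(pcm_bytes, dtype=np.int16)
    dst_len = int(len(samples) * dst_rate / src_rate)
    # Position of each destination sample within the source signal.
    src_pos = np.linspace(0, len(samples) - 1, dst_len)
    resampled = np.interp(src_pos, np.arange(len(samples)), samples)
    return resampled.astype(np.int16).tobytes()

# A 20 ms chunk at 16 kHz is 320 samples; upsampled for Voice Live it
# becomes 480 samples at 24 kHz (and the reverse on the way back).
acs_chunk = bytes(2 * 320)  # 640 bytes of 16-bit silence
upsampled = resample_pcm16(acs_chunk, 16_000, 24_000)
print(len(upsampled) // 2)  # -> 480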

Core Components

  1. Azure Communication Services: Handles the telephony infrastructure, providing phone numbers and call routing
  2. Voice Live API: Enables real-time speech recognition and synthesis with WebRTC streaming
  3. Azure AI Agent Service: Provides the intelligence layer for understanding and responding to customer queries
  4. WebSocket Bridge: Our custom Python application that connects these services
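To make the bridge concrete, here is a heavily simplified sketch of the FastAPI skeleton: one HTTP route for the ACS Event Grid webhooks (including the subscription validation handshake) and one WebSocket route for media. The route names and the media handler are illustrative placeholders, not the project’s actual code:

from fastapi import FastAPI, Request, WebSocket, WebSocketDisconnect

app = FastAPI()

async def handle_media_packet(packet: dict) -> None:
    """Placeholder: resample audio and forward it to the Voice Live session."""
    ...

@app.post("/api/incoming-call")  # illustrative route name
async def incoming_call(request: Request):
    events = await request.json()
    for event in events:
        # Event Grid sends a validation handshake before real events arrive.
        if event.get("eventType") == "Microsoft.EventGrid.SubscriptionValidationEvent":
            return {"validationResponse": event["data"]["validationCode"]}
        # Real call events would be answered here via the ACS Call
        # Automation SDK, pointing media streaming at the /ws route below.
    return {}

@app.websocket("/ws")
async def media_socket(ws: WebSocket):
    await ws.accept()
    try:
        # Each connected call runs in its own coroutine, which is how a
        # single Python thread serves multiple concurrent calls.
        while True:
            packet = await ws.receive_json()
            await handle_media_packet(packet)
    except WebSocketDisconnect:
        pass  # call ended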
Read more
Custom Voices in Azure OpenAI Realtime with Azure Speech Services

🎯 TL;DR: Hybrid GPT-4o Realtime with Azure Speech Services Custom Voices

This post demonstrates bypassing GPT-4o Realtime’s built-in voice limitations by creating a hybrid architecture that combines GPT-4o’s conversational intelligence with Azure Speech Services’ extensive voice catalog. The solution configures GPT-4o Realtime for text-only output (ContentModalities.Text) and routes responses through Azure Speech Services, enabling access to 400+ neural voices, custom neural voices (CNV), and SSML control. The implementation includes intelligent barge-in functionality using real-time audio amplitude monitoring, allowing users to interrupt the assistant naturally mid-response.

Technical implementation: C# application using Azure.AI.OpenAI and Microsoft.CognitiveServices.Speech SDKs, NAudio for audio I/O, streaming text collection from GPT-4o responses, RMS-based speech detection with configurable thresholds, and concurrent audio management for seamless interruption handling. Complete C# source code with audio helpers available here


Building realtime voice-enabled applications with Azure OpenAI’s GPT-4o Realtime model is incredibly powerful, but there’s one significant limitation that can be a deal-breaker for many use cases: you’re stuck with OpenAI’s predefined voices like “sage”, “alloy”, “echo”, “fable”, “onyx”, and “nova”.

What if you’re building a branded customer service bot that needs to match your company’s voice identity? Or developing a therapeutic application for children with autism where the voice quality and tone are crucial for engagement? What if your users need to interrupt the assistant naturally, just like in real human conversations?

In this comprehensive guide, I’ll show you exactly how I solved these challenges by building a hybrid solution that combines the conversational intelligence of GPT-4o Realtime with the voice flexibility of Azure Speech Services. We’ll dive deep into the implementation, covering everything from the initial problem to the complete working solution.

flowchart TD
    A[👤 User speaks] --> B[🎤 Microphone Input]
    B --> C{"Barge-in Detection<br/>Audio Level > Threshold?"}
    C -->|Yes| D[🛑 Stop Azure Speech]
    C -->|No| E[📡 Stream to GPT-4o Realtime]
    E --> F[🧠 GPT-4o Processing]
    F --> G["📝 Text Response<br/>ContentModalities.Text"]
    G --> H["🗣️ Azure Speech Services<br/>Custom/Neural Voice"]
    H --> I[🔊 Audio Output]
    D --> E
    I --> J[👂 User hears response]
    J --> A
    style A fill:#e1f5fe
    style D fill:#ffebee
    style G fill:#f3e5f5
    style H fill:#e8f5e8
    style I fill:#fff3e0
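The barge-in decision in the diagram reduces to comparing each microphone frame’s RMS amplitude against a threshold. The post’s implementation is C#; the same idea fits in a few lines of Python (the threshold value is illustrative and needs tuning against your input levels):

import math
import struct

BARGE_IN_THRESHOLD = 1000  # illustrative; 16-bit samples range 0-32767

def rms(pcm_frame: bytes) -> float:
    """Root-mean-square amplitude of a mono 16-bit PCM frame."""
    samples = struct.unpack(f"<{len(pcm_frame) // 2}h", pcm_frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def user_is_speaking(pcm_frame: bytes) -> bool:
    # When this fires while the assistant is talking, stop Azure Speech
    # playback and route the new utterance to GPT-4o Realtime (barge-in).
    return rms(pcm_frame) > BARGE_IN_THRESHOLD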
Read more