📡 Telecom Industry Overview
What is Telecom and why DevOps engineers are in high demand
Telecom (telecommunications) is the infrastructure that carries every phone call, mobile data packet, and internet connection worldwide. A major telecom operator like Vodafone or Airtel manages hundreds of thousands of network elements — base stations, routers, switches, fibre cables — across an entire country.
The shift happening right now: network functions that used to run on proprietary hardware boxes are moving to software running on Kubernetes. A 5G core network is now microservices. A network management platform like TeMIP runs on OpenShift. This is exactly why DevOps engineers with container and cloud skills are being hired by telcos.
How a telecom network is structured
| Layer | What it is | Example |
|---|---|---|
| Radio Access Network (RAN) | Base stations, antennas — what your phone connects to | 4G/5G towers, small cells |
| Transport Network | Fibre and microwave links carrying data between sites | Transmission links, GPON |
| Core Network | Routing, switching, subscriber management | 4G EPC, 5G core (AMF, SMF, UPF) |
| OSS | Operations Support Systems — manage the network | TeMIP, UOC, UTM |
| BSS | Business Support Systems — manage the business | Billing, CRM, order management |
⚙️ OSS and BSS — The Two Pillars
OSS — Operations Support Systems
OSS manages the technical operation of the network. Think of it as the IT systems that the engineering and NOC (Network Operations Centre) teams use to keep the network running.
- Fault Management — detect, correlate, and resolve network alarms. When a fibre is cut, thousands of alarms fire. Fault management correlates them to show one root cause instead of 10,000 symptoms. TeMIP does this.
- Configuration Management — track and push configuration to network elements. What software version is running on base station 4532? What parameters are configured? CMDB (Configuration Management Database) answers this.
- Performance Management — collect KPIs from network elements. Call drop rate, packet loss, throughput, latency per site. Feeds dashboards for NOC engineers.
- Inventory Management — what hardware exists, where, connected to what. UTM manages this for RAN, GPON, Transmission domains.
- Provisioning — activate new services automatically. New SIM card activated → OSS configures the network to allow that SIM to connect.
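The fault-management idea above (many symptom alarms collapsing into one suspected cause) can be sketched in a few lines of Python. This is only an illustration: the `UPSTREAM_LINK` table, the element names, and the alarm fields are hypothetical, and a production correlator would also use timing windows and far richer topology.

```python
from collections import defaultdict

# Hypothetical topology: each network element maps to the fibre link it depends on.
UPSTREAM_LINK = {
    "bts-4532": "fibre-A",
    "bts-4533": "fibre-A",
    "bts-4534": "fibre-A",
    "bts-9001": "fibre-B",
}


def correlate(alarms, upstream=UPSTREAM_LINK):
    """Group symptom alarms by the shared upstream link they depend on."""
    groups = defaultdict(list)
    for alarm in alarms:
        # Elements with no known upstream link are treated as their own cause.
        cause = upstream.get(alarm["element"], alarm["element"])
        groups[cause].append(alarm)
    return [
        {"suspected_cause": cause, "symptom_count": len(members)}
        for cause, members in sorted(groups.items())
    ]


alarms = [
    {"element": "bts-4532", "text": "Link down"},
    {"element": "bts-4533", "text": "Link down"},
    {"element": "bts-4534", "text": "Link down"},
    {"element": "bts-9001", "text": "Link down"},
]
incidents = correlate(alarms)
for incident in incidents:
    print(incident)
```

Four alarms become two incidents here; on a real fibre cut the ratio is much larger, which is the point of correlation.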
BSS — Business Support Systems
BSS manages the customer and revenue side of the telco.
- CRM — customer records, service history, complaints, churn prediction
- Billing and Revenue Management — rate every voice minute, SMS, data MB. Generate bills. Process payments. Handle disputes.
- Order Management — new service activation, plan changes, porting numbers, cancellations
- Product Catalogue — all available tariff plans, bundles, add-ons
🖥️ TeMIP — HPE Telecom Management Platform
What is TeMIP?
TeMIP (Telecommunications Management Information Platform) is HPE's fault management solution for telecom networks. It is a Manager of Managers (MoM): it sits above the individual Element Management Systems (EMS) and consolidates their alarms into a single view. A major telco might run Nokia, Ericsson, Huawei, and ZTE equipment; TeMIP receives alarms from all of them through Access Modules and presents a unified view to NOC operators.
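The Manager-of-Managers pattern can be pictured as one small adapter per vendor feed, each translating its own payload into a common alarm shape. The Python sketch below is an assumption-laden illustration, not TeMIP's actual Access Module interface: the two vendor payload layouts, the severity vocabularies, and the `UnifiedAlarm` fields are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnifiedAlarm:
    source: str
    element: str
    severity: str
    text: str


# Hypothetical severity vocabularies for two imaginary EMS feeds.
VENDOR_A_SEVERITY = {1: "CRITICAL", 2: "MAJOR", 3: "MINOR"}
VENDOR_B_SEVERITY = {"crit": "CRITICAL", "maj": "MAJOR", "min": "MINOR"}


def from_vendor_a(raw):
    """Adapter for the imaginary vendor A payload layout."""
    return UnifiedAlarm("vendor-a", raw["ne"], VENDOR_A_SEVERITY[raw["sev"]], raw["msg"])


def from_vendor_b(raw):
    """Adapter for the imaginary vendor B payload layout."""
    return UnifiedAlarm(
        "vendor-b", raw["node_name"], VENDOR_B_SEVERITY[raw["level"]], raw["description"]
    )


# One adapter per vendor feed, playing the role of an access module.
ACCESS_MODULES = {"vendor-a": from_vendor_a, "vendor-b": from_vendor_b}


def collect(feed):
    """Translate every (vendor, payload) pair into the unified alarm shape."""
    return [ACCESS_MODULES[vendor](raw) for vendor, raw in feed]


feed = [
    ("vendor-a", {"ne": "bts-4532", "sev": 1, "msg": "Link down"}),
    ("vendor-b", {"node_name": "olt-17", "level": "maj", "description": "Power fault"}),
]
unified = collect(feed)
for alarm in unified:
    print(alarm)
```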
vTeMIP — the virtualised version
vTeMIP is the cloud-native version that runs on Kubernetes/OpenShift. It is a microservices architecture:
| Component | What it does | Technology |
|---|---|---|
| TNS | TeMIP Naming Service — lookup service for all managed network entities. Gives every network element a unique name. Distributes and replicates name data for HA. | Distributed database |
| ACS FM | Alarm Collection Server — subscribes to alarm collections on Operation Contexts (OCs). Multiplexes multiple sources to single subscriptions. Reduces load on each OC. | Java microservice |
| TWS | TeMIP Web Services: the northbound interface over SOAP/REST. External applications (UCA, UOC) access TeMIP through TWS. URL: http://IP:7180/TeMIP_WS | Java/Tomcat, Axis2 |
| UMB | Unified Mediation Bus — event messaging backbone based on Kafka. All alarms flow through UMB between components. | Apache Kafka |
| FAS | Fault Archival and Statistics — archives alarm data and provides statistical analytics. Transforms raw fault data into actionable information. | Analytics engine |
| Access Modules | Adapters that collect alarms from specific vendor EMS/NMS. CRB (TCP-based) for Nokia NetAct and Huawei U2000. IST (UDP/SNMP) for ZTE U31. Confluent AM for the DU data pipeline. | Per-vendor adapters |
| AutoPass | HPE licence management server — TeMIP checks licences here | Licence service |
| RHGS | Red Hat Gluster Storage, a distributed file system for vTeMIP data | Distributed storage |
Key vTeMIP concepts
- Operation Context (OC): a named scope of managed entities, like a folder that contains network elements. ACS creates alarm collections on OCs.
- Resynchronisation (Resync): the process that re-fetches the current alarm state from the network, used after a connection failure. Note: resync does not support alarms above 64 KB in size. In DU, a dedicated resync adapter handles this for all integrations.
- Confluent Adapter: in the HPE DU deployment, TeMIP does NOT connect directly to the EMS. Instead, a Confluent (Kafka) adapter receives alarm data from Data Factory and feeds TeMIP. This decouples the network-facing systems from the management platform.
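The decoupling described for the Confluent adapter can be illustrated with an in-memory stand-in for a Kafka topic. Everything in this Python sketch is hypothetical (the `Topic` and `Adapter` classes, the alarm payloads); it only shows why a log with consumer offsets lets the producer keep publishing while the consumer is idle and lets the consumer catch up later.

```python
class Topic:
    """In-memory stand-in for a Kafka topic (no partitions, no persistence)."""

    def __init__(self):
        self._log = []

    def publish(self, message):
        self._log.append(message)

    def read_from(self, offset):
        return self._log[offset:]


class Adapter:
    """Consumes from the topic and hands alarms to a sink standing in for TeMIP."""

    def __init__(self, topic, sink):
        self.topic = topic
        self.sink = sink
        self.offset = 0  # position of the next unread message

    def poll(self):
        batch = self.topic.read_from(self.offset)
        for message in batch:
            self.sink.append(message)
        self.offset += len(batch)
        return len(batch)


topic = Topic()
temip_inbox = []
adapter = Adapter(topic, temip_inbox)

# The producer side publishes without knowing whether the adapter is running.
topic.publish({"element": "bts-4532", "text": "Link down"})
topic.publish({"element": "bts-4533", "text": "Link down"})

first = adapter.poll()   # catches up on both messages
second = adapter.poll()  # nothing new
topic.publish({"element": "bts-4532", "text": "Link up"})
third = adapter.poll()
print(first, second, third, len(temip_inbox))
```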
Managing vTeMIP in production
```shell
# Check vTeMIP application status
manage show mcc 0 appl temip_web_services all att
manage show mcc 0 appl acs_fm all att

# Start / stop TWS
manage stop mcc 0 appl temip_web_services
manage start mcc 0 appl temip_web_services

# Check, stop, or start UMB (Kafka)
umb -c prod_umb1 status
umb -c prod_umb1 stop
umb -c prod_umb1 start

# Troubleshoot alarms not updating in UOC
# 1. Check OSSM processes
systemctl status ossm
# 2. Check TeMIP adapter
umb -c prod_umb1 status
# 3. Check ACS_FM and TWS
manage show mcc 0 appl acs_fm all att
# 4. Check alarm count in H2 database (the H2 class name is case-sensitive: Shell)
java -cp $OSSM_HOME/lib/h2-*.jar org.h2.tools.Shell -user sa -url jdbc:h2:tcp://localhost:9093/mem:uocCenterPool
# SQL: select operation_context, count(*) from RANALARM group by operation_context;
```
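To see what the alarm-count query in step 4 returns, the same SQL can be tried against a throwaway database. The Python sketch below uses an in-memory SQLite database purely as a stand-in for the H2 instance; only the table name `RANALARM` and the `operation_context` column come from the query above, while the `alarm_id` column and the OC names are made up.

```python
import sqlite3

# In-memory SQLite used as a stand-in for the H2 alarm store on port 9093.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE RANALARM (alarm_id INTEGER, operation_context TEXT)")
conn.executemany(
    "INSERT INTO RANALARM VALUES (?, ?)",
    [(1, "oc_ran_north"), (2, "oc_ran_north"), (3, "oc_ran_south")],
)

# Same grouping as the troubleshooting query, with an ORDER BY for stable output.
rows = conn.execute(
    "SELECT operation_context, COUNT(*) FROM RANALARM "
    "GROUP BY operation_context ORDER BY operation_context"
).fetchall()
print(rows)
conn.close()
```

An Operation Context whose count stays flat while the network is known to be alarming is a hint that the feed for that OC has stalled.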
🖥️ UOC — Unified OSS Console
What is UOC?
UOC (Unified OSS Console) is HPE's web-based operations portal — the single pane of glass that NOC engineers look at. While TeMIP is the alarm engine underneath, UOC is the interface that operators interact with. It provides: alarm views, dashboards, trouble ticket creation, map-based views of sites.
UOC Architecture
| Component | Role |
|---|---|
| UOC Core | Main application server that serves the web UI. Multiple instances for HA, one per host: meyclvsuocwebh01-04 |
| UOC AM | Alarm Manager — receives alarms from TeMIP adapter and stores in H2 in-memory database (port 9093) |
| Keycloak | Identity and Access Management — SSO, LDAP integration, user authentication |
| OSSM | OSS Manager — the service that manages UOC components. Start/stop/status via systemctl |
| H2 Database | In-memory database for alarm storage (port 9093) and dashboard data (port 9100) |
| TeMIP Adapter | Bridge between TeMIP/UMB and UOC — transforms TeMIP alarm format to UOC format |
Custom Dashboards in UOC
In the HPE DU deployment, several custom dashboards were built:
- VIP Sites Map: shows VIP site alarm status on a geographic map. Data source: the BLACKOUT_SITES table, which combines site inventory from UTM with alarm data from the UOC TEMIP_ALARM table.
- PRB Utilisation / RF KQI / High UL BLER Map: radio performance dashboards, served from a separate H2 instance on port 9100. Node types: SitePRB, SiteRFKQI, SiteULBLER.
- Mobile Network Alarm Search / SA Network Alarm Search: operator alarm query tools
- VIP Event Monitoring Filter: filters that show only VIP site events
Custom Trouble Ticket (TT) in UOC
Trouble ticket creation is customised: Location → Site Name, Faulty Entity → System Name. Customer-specific dropdowns cover Technology Domain, TT Group Name, Service Impact, Subscriber Impact, and Urgency. The TT Summary is auto-populated from the Alarm Name and Information1 fields.
Java Transformations
Custom Java transformation code was written for: Alarm Duration (clearance timestamp - creation timestamp), Core System Name (copy from System Name), TT Summary (Alarm Name + Information1). These run as transformation JARs in the OSSM pipeline.
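The three transformations are simple field derivations. The production code is Java packaged as JARs; the sketch below restates the logic in Python for brevity and is not the deployed implementation. The dictionary keys and the `" - "` separator in the summary are assumptions, since the text only says the summary is Alarm Name plus Information1.

```python
from datetime import datetime


def transform(alarm):
    """Return a copy of the alarm with the three derived fields added."""
    out = dict(alarm)
    cleared = alarm.get("cleared_at")
    # Alarm Duration: clearance timestamp minus creation timestamp.
    # Alarms that are still open have no duration yet.
    out["alarm_duration_s"] = (
        (cleared - alarm["created_at"]).total_seconds() if cleared else None
    )
    # Core System Name: straight copy of System Name.
    out["core_system_name"] = alarm["system_name"]
    # TT Summary: Alarm Name plus Information1 (separator is an assumption).
    out["tt_summary"] = f"{alarm['alarm_name']} - {alarm['information1']}"
    return out


sample = {
    "alarm_name": "Cell Unavailable",
    "information1": "Sector 2",
    "system_name": "bts-4532",
    "created_at": datetime(2024, 1, 1, 10, 0, 0),
    "cleared_at": datetime(2024, 1, 1, 10, 5, 30),
}
result = transform(sample)
print(result["alarm_duration_s"], result["core_system_name"], result["tt_summary"])
```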
UOC Operations
```shell
# OSSM (manages all UOC components)
systemctl status ossm
systemctl start ossm
systemctl stop ossm

# UOC Core web application
systemctl status uoc2@3000
systemctl start uoc2@3000
systemctl stop uoc2@3000

# Health checks
# 1. UOC Core running:
curl http://localhost:3000/uoc/
# 2. UOC AM running: check temip-adapter process
# 3. CPU/Memory (inside top, press 1, c, m for per-CPU, full command line, and memory views):
free -gh && top

# Troubleshoot - alarms not updating in UOC
# Check all OSSM processes:
systemctl status ossm
# Check temip-adapters:
umb -c prod_umb1 status
# Check vTeMIP:
manage show mcc 0 appl acs_fm all att
# Check alarm count: connect to H2 on port 9093 (see above)

# Troubleshoot - dashboard data missing
# Check OSSM running
# Check H2 DB on port 9100 is running
# Check cron job updating BLACKOUT_SITES table
# Check log file for script output and errors
# Check data count in BLACKOUT_SITES for each dashboard
```
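The troubleshooting sequences above follow one pattern: run the checks in order and stop at the first one that fails, because everything downstream depends on it. A rough Python sketch of that pattern follows. The helper names are hypothetical; `systemctl is-active --quiet` and `urllib.request.urlopen` are standard, but the demo uses stubbed probes so it runs on any machine.

```python
import subprocess
import urllib.request


def systemd_active(unit):
    """True if `systemctl is-active` reports the unit as active."""
    try:
        done = subprocess.run(["systemctl", "is-active", "--quiet", unit])
        return done.returncode == 0
    except FileNotFoundError:  # systemctl is not installed on this host
        return False


def http_ok(url, timeout=3.0):
    """True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:  # covers refused connections, timeouts, and HTTP errors
        return False


def run_checks(checks):
    """Run (name, probe) pairs in order and stop at the first failing probe."""
    results = []
    for name, probe in checks:
        ok = bool(probe())
        results.append((name, ok))
        if not ok:
            break
    return results


# Stubbed probes for the demo. On a real host these could be, for example,
# lambda: systemd_active("ossm") and lambda: http_ok("http://localhost:3000/uoc/").
demo_checks = [
    ("ossm", lambda: True),
    ("uoc-core", lambda: False),
    ("h2-9093", lambda: True),
]
results = run_checks(demo_checks)
print(results)
```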
🗺️ UTM — Unified Topology Manager
What is UTM?
UTM (Unified Topology Manager) is an integration platform that provides consolidated network configuration data (inventory) to other OSS products: vTeMIP, UCA, UOC. It synchronises network configuration data and manages multiple domains: RAN (Radio Access Network), GPON (fibre), Transmission (IP/microwave).
UTM Directory Structure
- /opt/UTM/: ETL rules, source/target models, CLI management, and stop/start/configure for the UTM service
- /var/opt/UTM/: internal UTM runtime (user configuration and features)
How UTM works — ETL Pipeline
| Phase | What happens | Views created |
|---|---|---|
| Collection | Collectors get network configuration from Input Tables | A_RAN_INVENTORY, A_GPON_INVENTORY, A_TX_INVENTORY |
| Population | Populates target models from collected views | Normalised inventory data |
| ETL Rules | Transform and load rules that map source to target models | Configured per domain |
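The collection, population, and rule phases can be pictured as a per-domain mapping from source columns to one normalised target model. In the Python sketch below only the view names (`A_RAN_INVENTORY`, `A_GPON_INVENTORY`) come from the table above; the column names, the target fields, and the rule format are hypothetical and do not reflect UTM's real rule syntax.

```python
# Hypothetical ETL rules: per source view, the domain label and the
# source-column -> target-field mapping.
ETL_RULES = {
    "A_RAN_INVENTORY": {
        "domain": "RAN",
        "fields": {"SITE_CODE": "site_id", "BTS_NAME": "name"},
    },
    "A_GPON_INVENTORY": {
        "domain": "GPON",
        "fields": {"OLT_ID": "site_id", "OLT_NAME": "name"},
    },
}


def populate(view, rows, rules=ETL_RULES):
    """Apply the rule for one collected view and return normalised records."""
    rule = rules[view]
    normalised = []
    for row in rows:
        record = {"domain": rule["domain"]}
        for source, target in rule["fields"].items():
            record[target] = row[source]
        normalised.append(record)
    return normalised


collected = {
    "A_RAN_INVENTORY": [{"SITE_CODE": "S-4532", "BTS_NAME": "bts-4532"}],
    "A_GPON_INVENTORY": [{"OLT_ID": "S-0017", "OLT_NAME": "olt-17"}],
}
inventory = []
for view, rows in collected.items():
    inventory.extend(populate(view, rows))
print(inventory)
```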
The VIP Sites and PRB dashboards in UOC use the BLACKOUT_SITES table, which is populated from UTM inventory and UOC alarm data. UTM provides the site inventory (which sites exist, their coordinates, and their type), and UOC provides the current alarm state for each site.
UTM in the DU Data Flow
UTM sits in the middle of the OSS stack:
Network Elements (RAN/GPON/TX) → Input Tables → UTM Collection → UTM Population → UOC Inventory → VIP/PRB Dashboards
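The last hop of this flow, where site inventory meets alarm state to feed a dashboard table such as BLACKOUT_SITES, is essentially a left join on the site identifier. A minimal Python sketch, assuming made-up field names (`site`, `lat`, `lon`, `vip`) rather than the real table schema:

```python
from collections import Counter


def build_dashboard_rows(inventory, alarms):
    """Left-join site inventory with the number of active alarms per site."""
    active = Counter(alarm["site"] for alarm in alarms)
    return [
        # Sites with no alarms still appear, with a count of zero.
        {**site, "active_alarms": active.get(site["site"], 0)}
        for site in inventory
    ]


# Inventory side (the role UTM plays in the flow above).
inventory = [
    {"site": "S-4532", "lat": 25.20, "lon": 55.27, "vip": True},
    {"site": "S-4533", "lat": 25.07, "lon": 55.14, "vip": False},
]
# Alarm side (the role UOC plays in the flow above).
alarms = [
    {"site": "S-4532", "text": "Link down"},
    {"site": "S-4532", "text": "Cell unavailable"},
]
dashboard_rows = build_dashboard_rows(inventory, alarms)
for row in dashboard_rows:
    print(row)
```

Keeping sites with zero alarms in the output matters for a map view: a healthy VIP site should still be drawn.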
🔗 UCA — Unified Correlation and Automation
What is UCA?
UCA (Unified Correlation and Automation) provides topology-based, cross-domain event correlation, root cause analysis (RCA), service impact analysis, and automated problem remediation. It answers two questions: which customers does this alarm on base station X affect, and can it be fixed automatically?
UCA capabilities
- Root Cause Analysis — correlates multiple alarms to identify the single root cause. 1000 alarms → 1 root cause event.
- Service Impact Analysis — given a root cause, which services and customers are affected?
- Automated Remediation — trigger automated fix actions when specific alarm patterns are detected
- Topology-aware — understands network topology (which elements connect to which) to calculate impact accurately
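Topology-aware impact analysis amounts to walking the dependency graph downstream from the failed element. The Python sketch below uses a breadth-first traversal over a hypothetical topology; the node names and the `svc-` prefix convention for services are invented for the example and say nothing about how UCA models topology internally.

```python
from collections import deque

# Hypothetical dependency graph: each key feeds the elements listed under it.
TOPOLOGY = {
    "fibre-A": ["agg-router-1"],
    "agg-router-1": ["bts-4532", "bts-4533"],
    "bts-4532": ["svc-vip-hospital"],
    "bts-4533": ["svc-retail-mall"],
    "fibre-B": ["bts-9001"],
    "bts-9001": ["svc-stadium"],
}


def downstream(root, topology=TOPOLOGY):
    """Breadth-first walk returning everything that depends on the root."""
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in topology.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def impacted_services(root, topology=TOPOLOGY):
    """Filter the downstream set to service nodes (svc- prefix is a convention here)."""
    return sorted(n for n in downstream(root, topology) if n.startswith("svc-"))


services = impacted_services("fibre-A")
print(services)
```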
📶 NFV, SDN and 5G
The transformation happening in telecom right now
Traditional telecom ran on dedicated hardware — a proprietary box that did one specific job (a router, a base station controller, a signalling server). Each box cost millions, required specialist engineers, and took months to deploy. Upgrading functionality meant replacing hardware.
NFV — Network Functions Virtualisation
NFV moves network functions from proprietary hardware to software running on standard x86 servers. A function that used to need a dedicated box now runs as a virtual machine or container on commodity hardware. Key benefits: deploy in hours not months, scale horizontally, upgrade via software release not hardware replacement.
| Traditional (physical) | NFV (virtualised) |
|---|---|
| Dedicated hardware per function | Software VNF on standard servers |
| Deploy in months | Deploy in hours (or CI/CD pipeline) |
| Scale by buying more hardware | Scale by spinning up more VMs/containers |
| Upgrade by hardware replacement | Upgrade by software deployment |
| Vendor lock-in | Open standards, multi-vendor |
5G Core — cloud-native microservices
The 5G core network is designed from the ground up as cloud-native microservices. Each function is a separate service with an API:
- AMF (Access and Mobility Management Function): handles UE registration and mobility
- SMF (Session Management Function): manages data sessions
- UPF (User Plane Function): forwards the user data packets
- UDM (Unified Data Management): subscriber database
- NRF (Network Repository Function): service discovery for the 5G core
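The NRF's role as service discovery can be shown with a toy in-memory registry: network functions register themselves by type, and other functions look them up. This Python sketch is only an illustration. A real NRF exposes registration and discovery over HTTP-based service interfaces and stores full NF profiles; the class name, instance IDs, and URIs below are hypothetical.

```python
class ToyNRF:
    """In-memory registry keyed by NF type, standing in for an NRF."""

    def __init__(self):
        self._instances = {}  # nf_type -> {instance_id: uri}

    def register(self, nf_type, instance_id, uri):
        self._instances.setdefault(nf_type, {})[instance_id] = uri

    def deregister(self, nf_type, instance_id):
        self._instances.get(nf_type, {}).pop(instance_id, None)

    def discover(self, nf_type):
        """Return the URIs of all registered instances of the given type."""
        return sorted(self._instances.get(nf_type, {}).values())


nrf = ToyNRF()
nrf.register("SMF", "smf-1", "http://smf-1.core.example:8080")
nrf.register("SMF", "smf-2", "http://smf-2.core.example:8080")
nrf.register("UPF", "upf-1", "http://upf-1.core.example:8080")

# An AMF that needs a session manager asks the registry instead of using a fixed address.
smf_endpoints = nrf.discover("SMF")
print(smf_endpoints)

nrf.deregister("SMF", "smf-1")  # e.g. the pod was scaled down
remaining = nrf.discover("SMF")
print(remaining)
```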
Nokia, Ericsson, and Samsung ship their 5G core as containers running on Kubernetes or OpenShift. Deploying a new 5G feature = Helm upgrade. Monitoring = Prometheus + Grafana. This is your world.