Available Offers for Data Science

Monitoring and Observability Engineer

Full-time
Remote

This role involves designing, implementing, and managing comprehensive monitoring solutions using Prometheus, Grafana, SNMP-Exporter, Streaming Telemetry, OpenTelemetry, and other related technologies.


Responsibilities

- Design, implement, and manage Prometheus-based monitoring solutions, including configurations and alert rules.

- Develop and maintain interactive and visually appealing Grafana dashboards.

- Configure SNMP modules and scrape jobs to collect SNMP metrics efficiently across different network technologies.

- Use Git fluently: clone working branches, develop, and commit to the main branch, or follow another workflow that demonstrates a strong command of Git.

- Identify and onboard new metrics from various systems and applications, developing data pipelines for metrics collection and storage.

- Optimize and scale monitoring environments to handle large volumes of metrics and ensure comprehensive monitoring coverage.

- Implement and manage Streaming Telemetry solutions for real-time data collection and monitoring.

- Integrate and manage OpenTelemetry for comprehensive tracing and observability across services.

- Troubleshoot and resolve issues related to data collection, monitoring configurations, and dashboard performance.

- Work with DevOps, development, and operations teams to ensure applications and infrastructure are properly instrumented.

- Document configurations and procedures, and provide training to team members and stakeholders.

 

Skills

- Familiarity with network monitoring tools and practices.

- Extensive experience with Prometheus and related technologies (Alertmanager, Pushgateway, etc.).

- Strong knowledge of time-series databases and monitoring concepts.

- Proficiency in writing Prometheus queries (PromQL).

- Strong experience with Grafana and its ecosystem.

- Proficiency in creating and managing Grafana dashboards and panels.

- Knowledge of data visualization principles and best practices.

- Familiarity with monitoring and observability tools and practices.

- Strong knowledge of SNMP protocols and network device management.

- Experience with SNMP-Exporter and its integration with Prometheus.

- Strong skills in creating SNMP modules and scrape configs for various network technologies.

- Strong Git experience.

- Strong understanding of metrics and monitoring concepts.

- Experience with metrics collection tools (Prometheus, Telegraf, Collectd, etc.).

- Experience with Streaming Telemetry solutions for real-time monitoring.

- Experience with OpenTelemetry for tracing and observability.

- Familiarity with Linux/Unix systems and scripting languages (Bash, Python).

- Experience with containerization and orchestration tools (Docker, Kubernetes).

 

Qualifications

- Bachelor’s degree in Computer Science, Engineering, or a related field.

- 5+ years of experience in monitoring and observability roles.

- Proficiency in tools like Prometheus, Grafana, PromQL, Alertmanager, Alert Framework, GitHub, SNMP-Exporter, Streaming Telemetry, and OpenTelemetry (OTel).

- Strong coding and scripting skills.

- Excellent problem-solving abilities and attention to detail.

- Strong communication and teamwork skills.

 

Lead ML Engineer

Remote
Full-time

Responsibilities 

• Evaluate and adapt state-of-the-art machine learning (ML), computer vision (CV), generative AI, and time series forecasting algorithms to meet product and client objectives. 

• Research, design, and implement innovative ML algorithms for image, video, multimodal, and temporal data. 

• Architect and develop full-stack ML pipelines, from data acquisition and preprocessing to training, evaluation, and deployment in cloud (AWS) or edge environments.

• Prototype and validate proof-of-concept (POC) solutions for vision, generative AI, and time-series forecasting problems. 

• Translate customer requirements into actionable tasks, ensuring a clear understanding of objectives, scope, and expected outcomes. 

• Analyze structured and unstructured data to uncover trends, patterns, and anomalies. Apply ML and statistical methods for prediction and forecasting. 

• Prepare detailed technical documentation, reports, and presentations for internal and external stakeholders. 

• Communicate complex technical topics effectively to both technical and non-technical stakeholders, including clients and business partners. 

• Lead projects from prototype to production, ensuring scalability, reliability, and performance of solutions. 

• Contribute to internal software development processes and team collaboration initiatives. 


Requirements 

• Strong hands-on experience in delivering ML solutions, including production-grade computer vision and forecasting models. 

• Proven expertise in forecasting and time series data handling (e.g., ARIMA, LSTM, temporal convolutional networks). 

• Proficiency in image and video processing, including segmentation, pose estimation, object detection, and multimodal data fusion. 

• Experience with generative AI models such as diffusion-based text-to-image/video, multimodal LLMs, and prompt engineering. 

• Skilled in reading, interpreting, and applying insights from academic research papers. 

• Expertise in deep learning frameworks like PyTorch or TensorFlow. 

• Strong object-oriented programming skills with clean, production-quality Python code.

• Familiarity with Vision Transformers (ViTs), especially for action recognition, object tracking, and video understanding tasks. 

• Cloud deployment experience, particularly with AWS. 

• Excellent communication skills in English (C1 or higher), both written and spoken. 

• Strong ability to work independently, prioritize tasks, and manage multiple projects simultaneously. 

Nice to Have 

• Master’s or Ph.D. degree in Machine Learning, Computer Science, Mathematics, or a related field.

• Contributions to open-source ML or CV libraries or participation in Kaggle competitions.