Welcome to Chong Hu's Blog


Chong Hu

Phone

US: (+1) 917-388-5186
CN: (+86) 158-1154-7619

Email

ch3467@columbia.edu
jackchonghu@gmail.com

Education

Columbia University (CU), New York, US, Aug 2019 - Dec 2020

The Fu Foundation School of Engineering and Applied Science

GPA: 4.0/4.0

M.S. in Electrical Engineering
Courses: Database, Algorithm, Computer Networks, Programming Language & Translator, Stream Processing


Shanghai Jiao Tong University (SJTU), Shanghai, CN, Sep 2015 - Aug 2019

Joint Institute: University of Michigan-Shanghai Jiao Tong University Joint Institute (UM-SJTU JI)

GPA: 3.3/4.0

B.S. in Electrical and Computer Engineering; Minor in Data Science
Courses: Data Structures and Algorithms, Operating System, Methods and Tools for Big Data, AI Techniques

Work Experience

Megvii Inc., Beijing, CN, Mar 2021 - Now

Research & Development Engineer, Transformer Team of Research Institute

Sep 2021 - Now

Research & Development Engineer, Video Team of Research Institute

Mar 2021 - Sep 2021

CertiK LLC, New York, US, May 2020 - Aug 2020

Research & Development Intern, R&D Team

  • Designed and built a task management system to connect Ethereum and Cosmos chains through WebSockets and handle tasks with multiprocessing in Golang; used DynamoDB as a cache database to serve data to the front end
  • Constructed multiple strategies to combine security check logic and exposed endpoints through Lambda functions
  • Provided RESTful APIs and a command-line interface for task management
  • Wrote unit tests to verify functionality, and added a logging mechanism and an error handler to improve system robustness

MokaHR Inc., Beijing, CN, Dec 2018 - Apr 2019

Software Engineer Intern, AI Team

  • Combined CTPN and CRNN into a model that solves OCR problems (Chinese & English) in resume images using TensorFlow; simplified the network structure and sped up inference (2s/10s on average) with only a 2% loss in accuracy
  • Adapted open-source labeling software to annotate text, and ran evaluations and unit tests for the different stages
  • Packaged the model as a web service using gunicorn and Flask, provided an API, and deployed it on Alibaba Cloud
  • Implemented a cache mechanism with Redis and multistage recognition with high accuracy (over 90% per label)
  • Improved overall performance by 15% and QPS by about 200% over the original third-party service using parallel processing in Python

Beijing Infervision Inc., Beijing, CN, Jan 2018 - May 2018

Software Engineer Intern, Modeling Team

  • Applied YOLOv2 and YOLOv3 under the Darknet framework and FPN under MXNet for disease detection on DR images
  • Computed anchor sizes and counts with several methods for YOLO and combined three detection layers, improving accuracy on tiny objects by roughly 5%
  • Used Focal Loss in TensorFlow to focus on cases with fewer samples; increased average accuracy by about 3%
  • Connected the recognition model to the back end inside Docker and fixed medical-image bugs in the data pipeline

Projects

Cloud Travel Planner Platform Using AWS Services

Oct 2020 - Dec 2020

Team Member, CU

Course: Cloud Computing & Big Data
  • Built a platform on multiple AWS services that provides attraction recommendations, schedule arrangement, and collection of travel partners' opinions
  • Implemented multiple Lambda functions, exposed through API Gateway, to provide the functionality of the different components
  • Integrated Cognito to manage user identities; stored service data in DynamoDB; set up Elasticsearch for attractions
  • Used AWS Amplify to provide automated CI/CD with CloudFront and CodePipeline
  • Implemented a GUI for users to interact with friends in real time and a chatroom for each group using GraphQL

IceBerg: Commercial Reimbursement System

Oct 2020 - Dec 2020

Team Member, CU

Course: Advanced Software Engineering
  • Built a reimbursement system using Spring Boot as the back-end framework, MyBatis as the database connector, and MySQL and Amazon S3 for back-end storage
  • Provided multiple features so that employees can request reimbursement with invoice images, managers can review and examine requests, and so on
  • Integrated external services: an email API to notify employees about the reimbursement process, the PayPal API to mock the real transfer process, an S3 bucket for image storage, an OCR API to auto-fill forms from invoices, and OAuth for quick login
  • Used Maven as the package manager, JUnit for unit testing, SpotBugs as the bug finder, JaCoCo as the coverage reporter, and GitHub Actions for automated CI

Rule-based Marketing Platform to Manage Call Detail Record (CDR)

Mar 2020 - May 2020

Team Member, CU

Course: Large-scale Stream Processing
  • Simulated streaming CDR data with a generator that exposes a real-time interface for changing modes, speed, distribution, etc.
  • Built a Pub/Sub scheme using Redis as the message queue and set up middleware to feed the stream to Spark Streaming
  • Provided multiple customizable templates to extract features; modularized the streaming process and optimized it with operator reordering, state sharing, and other optimization algorithms; reduced CPU usage by about 30% on average
  • Implemented a GUI application, backed by Django, to visualize real-time streaming features and receive live updates

Programming Language and Translator Design for Smart Contract

Mar 2020 - May 2020

Team Member, CU

Course: Programming Language & Translator
  • Designed the lexical conventions and context-free grammar for our smart contract language
  • Implemented the parser and semantic checker in OCaml; translated our semantically checked AST to Minic IR; provided a pretty-printing function at each stage; compiled code in our language to EVM bytecode; provided unit tests
  • Set up an EVM using the ganache-cli package and tested the compiled bytecode programs on it using JavaScript

Web Application for Video Object Segmentation and Visualization

Sep 2019 - Dec 2019

Team Member, CU

Course: Big Data Analytics
  • Adapted the OSVOS model to segment the foreground object from short videos; used FFmpeg and OpenCV to extract single frames from video, mask them with the recognized foreground area, and render them back to video; calculated the position of the segmented object
  • Provided a web API using Flask to exchange videos and corresponding metadata with the front end
  • Built a Django web application to receive video files, play rendered videos, and visualize metadata of the foreground object

High Dynamic Range (HDR) Video Recovery Algorithm

August 2018 - December 2018

Deputy Team Leader, SJTU; Company Sponsor: OTC/SSG/Intel

Graduation Project
  • Trained the hdrcnn model to transform LDR to HDR with different data augmentations and loss functions (e.g., cosine loss) in order to reconstruct overexposed areas and restore details in dark areas
  • Evaluated model performance using HDR-VDP v2 and scored 20 points (out of 100) higher than the traditional mathematical method
  • Used FFmpeg and OpenEXR to convert LDR video to images and images to HDR video, and added HDR10 metadata and the corresponding BT.2020 curves

Different Machine-Learning Methods to Predict Movie Popularity

August 2019 - December 2019

Team Member, CU

Course: Statistical Learning
  • Analyzed the dataset and visualized its conventional and social media features
  • Reproduced the results of the original paper by reimplementing all the models in R
  • Implemented multiple models to predict movie ratings, such as linear regression, LDA and QDA, Naive Bayes, decision trees, SVM, and so on

Music Recommendation System Analyzed from Million Song Dataset (MSD)

June 2019 - August 2019

Team Member, SJTU

Course: Methods and Tools for Big Data
  • Extracted song information from the 160 GB MSD and preprocessed the data using Hadoop and Drill
  • Built a similar-artist adjacency matrix using both Hadoop MapReduce and Spark, and compared the two methods
  • Used Naive Bayes to guide data scaling and investigated features of the data using clustering methods
  • Constructed the music recommendation pipeline based on the adjacency matrix and features

Bayesian Analysis based on Sample Point Data Using Julia

June 2019 - August 2019

Team Member, SJTU

Course: Bayesian Analysis
  • Plotted graphs of the situation and information using Matplotlib to understand the data and form basic assumptions
  • Proposed Bayesian and mathematical models based on observations, along with corresponding parameter simulation methods
  • Implemented k-means++ clustering in Julia to investigate the locations of trees and inferred new data to analyze error

Deep Learning Face Super-Resolution with Facial Prior

May 2019 - August 2019

Team Member, SJTU

Course: CNN for Visual Recognition
  • Re-implemented FSRNet in TensorFlow to reconstruct high-resolution face images
  • Processed facial landmarks in the Helen dataset and generated and augmented data with different methods
  • Illustrated the significance of using facial priors in face SR tasks with three metrics: MSE, SSIM, and PSNR

Multi-threaded and Efficient Programming in Database

Oct 2018 - Dec 2018

Team Member, SJTU

Course: Operating System
  • Implemented table management queries and data manipulation in C++ and handled exceptions raised by query errors
  • Accelerated the database using multi-threading and optimized data structures, achieving a 50% speedup

Model Analysis of Effect Factors on Rental Price

June 2018 - August 2018

Individual Project, SJTU

Course: Applied Regression Analysis using R
  • Downloaded 958 MB of data from Airbnb, then cleaned, classified, and processed it in R
  • Used GAM and GLM to fit and explain the data; checked model assumptions, outliers, high-leverage points, and so on to obtain a well-fitting model

MCM/ICM: Problem B: Turnpike Toll Plaza Model Based on Queuing Theory

February 2017

Team Leader, SJTU

  • Built an inflow model and a merging queue model to generate incoming cars and calculate the throughput when vehicles exit the toll plaza
  • Designed two new toll plazas, a reversible plaza and a separate plaza, to improve efficiency
  • Used C++ for visualization and MATLAB to generate data, and found solutions for different conditions

Technical Skills

Programming Languages:

Python, C++, C, Golang, R, Java, MATLAB, Julia, OCaml, JavaScript, HTML & CSS, SQL, Verilog, Solidity

Toolkits/Frameworks:

Linux, Hadoop, Spark, Git, NumPy, pandas, TensorFlow, Matplotlib, OpenCV, Flask, Django, LaTeX

Activities

Deputy Director, Public Information Department, Student Union of UM-SJTU JI, July 2016 - July 2017

Volunteer Teacher, Elementary School in Yunnan Province, December 2016 - January 2017

Interests:

Photography, Football
