
Both PyTorch and TensorFlow are widely used in the deep learning community. The two frameworks are becoming more similar, and converting code from one to the other is fairly straightforward because most functions have similar arguments and behavior. However, I found that torch.gather and tf.gather_nd are quite different: both do the same job, but they differ in how they are used. I will summarise the difference.

How torch.gather works

torch.gather(input, dim, index, *, sparse_grad=False, out=None) → Tensor

First, choose the dimension along which you want to gather elements. Then choose which elements in that dimension to gather. Also, index should have…
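To make the indexing rule concrete, here is a minimal pure-Python sketch of what torch.gather does for a 2-D input. The helper name gather_2d is my own and is not part of PyTorch; it only mirrors the documented rule (out[i][j] = input[index[i][j]][j] for dim=0, and out[i][j] = input[i][index[i][j]] for dim=1) on nested lists.

```python
def gather_2d(inp, dim, index):
    """Emulate torch.gather on 2-D nested lists (hypothetical helper, not a PyTorch API).

    dim=0: out[i][j] = inp[index[i][j]][j]
    dim=1: out[i][j] = inp[i][index[i][j]]
    """
    if dim not in (0, 1):
        raise ValueError("this sketch only handles 2-D input, so dim must be 0 or 1")
    out = []
    for i, row in enumerate(index):
        out_row = []
        for j, k in enumerate(row):
            # k is the position to pick along the chosen dimension
            out_row.append(inp[k][j] if dim == 0 else inp[i][k])
        out.append(out_row)
    return out


inp = [[1, 2], [3, 4]]
index = [[0, 0], [1, 0]]

out_dim1 = gather_2d(inp, 1, index)  # pick within each row
out_dim0 = gather_2d(inp, 0, index)  # pick within each column
print(out_dim1)  # [[1, 1], [4, 3]]
print(out_dim0)  # [[1, 2], [3, 2]]
```

Note that the output always has the same shape as index, not the same shape as the input.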

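This is my second semester since entering the Graduate School of Data Science, and I am now taking a course on scalable high-performance computing, which could be called an advanced course. The recent trend in data science and artificial intelligence is to push performance up with large amounts of data and over-parameterized algorithms, and this makes effective use of computing resources essential. At the relevant conferences, a huge variety of ideas pour out, and even if you came up with a similar idea, it often goes unrecognized because you could not run and validate enough experiments quickly. It is therefore important to use limited resources with as little waste as possible, and in this course, the background knowledge needed to achieve this goal …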

This is the last story of this series.

Export trained model (checkpoint) to TensorFlow graph

If you have trained your model successfully, you should find checkpoint files in the model_dir directory (the directory you set in the training command). Each checkpoint consists of three files (data, index, meta), which are distinguished by their extensions. The sample checkpoint for this post was created after 43450 epochs of training.
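If you want to check which checkpoints in a directory listing are complete, a small sketch like the following can help. The file names here (model.ckpt-43450.data-00000-of-00001 and so on) are an assumption based on the usual TF1 naming, and both function names are made up for this illustration.

```python
import re
from collections import defaultdict

# Assumed TF1-style names: <prefix>.data-00000-of-00001, <prefix>.index, <prefix>.meta
CKPT_PART = re.compile(r"^(?P<prefix>.+)\.(?P<part>data-\d+-of-\d+|index|meta)$")


def group_checkpoint_files(filenames):
    """Map each checkpoint prefix to the set of parts (data/index/meta) found."""
    groups = defaultdict(set)
    for name in filenames:
        match = CKPT_PART.match(name)
        if match:
            # "data-00000-of-00001" is recorded simply as "data"
            groups[match.group("prefix")].add(match.group("part").split("-")[0])
    return dict(groups)


def complete_checkpoints(filenames):
    """Return prefixes that have all three parts, sorted by name."""
    groups = group_checkpoint_files(filenames)
    return sorted(p for p, parts in groups.items() if parts == {"data", "index", "meta"})


# Hypothetical directory listing
listing = [
    "checkpoint",
    "model.ckpt-43000.index",
    "model.ckpt-43450.data-00000-of-00001",
    "model.ckpt-43450.index",
    "model.ckpt-43450.meta",
]
complete = complete_checkpoints(listing)
print(complete)  # ['model.ckpt-43450']
```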

Now export the frozen TensorFlow graph from the checkpoint files. You can use the command below for this.


In this post, I will talk about some code changes I made to customize the TF1 Object Detection API. If you are satisfied with the default code, this won't help you much.

Problem statement

I wanted to train a mask detector model starting from a face detector. To do this, I had to increase the number of classes from 1 to 2. Changing the configuration file was easy, but training did not go as I wanted.

I wanted to copy the weights of the face class and use them for both of my classes (mask/nomask). But the problem…

Up to step 2, most of the process focused on input preparation. Finally, this post covers the actual training step.

Prepare a pipeline configuration file

You set the training configuration in the pipeline config file. Preparing a pipeline config file from scratch is not easy, so I recommend going to the Model Zoo, downloading the base model you want, and starting from there. My base model is facessd_mobilenet_v2_quantized_open_image_v4.

After you download and extract the model, you will find a config file named pipeline.config. At first sight, though, it is not easy to tell which parts you should modify for your model.

I modified a few…

You can train your object detection model on your machine or Google Colab. I chose to run it on my local desktop and my school’s GPU server.

Clone git repository and start Docker

If you decide to train your model on your own machine, the first thing to do is clone the TF1 object detection git repository.

git clone

Then go to the cloned folder and start Docker.

# From the root of the git repository
docker build -f research/object_detection/dockerfiles/tf1/Dockerfile -t od .
docker run -it od

Then you will see something like this

Create tfrecord files

For efficient input of training data, Tensorflow uses its own format called TFRecord…

Collection of some useful docker commands

How to use docker without sudo: create a new group named docker and add your user to it. The first two commands need root privileges, so they are run with sudo once:

sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker

Create a Docker image from a Dockerfile. This uses “research/object_detection/dockerfiles/tf2/Dockerfile” as the Dockerfile and names the image od

docker build -f research/object_detection/dockerfiles/tf2/Dockerfile -t od .

Start a Docker container from the image (od) in interactive mode (-it)

docker run -it od

Start an exited container and enter interactive mode

docker start -a -i [Container id]

See all running docker containers

docker container ls

See all docker containers (including stopped containers)

docker container ls -a

During the summer of 2020, when COVID-19 swept the world, my classmates and I developed a mask detection neural network model. Our ultimate goal was to train a model that can tell whether a person is wearing a mask and that can also run on Google Coral, a recently released edge device that makes use of a TPU (Tensor Processing Unit). I will briefly explain the end-to-end process in this blog. We used the TensorFlow Object Detection API and here is the link.

The TensorFlow Object Detection API provides conversion functions that convert some well-known image dataset formats (e.g. PASCAL VOC dataset…

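I use SSH a lot when connecting from Linux to a server or another device, or when using GitHub, but every time I set it up I get confused about which is the public key and which is the private key, so I am writing it down here.

This time I had to set things up so that I could connect to a Google Coral from my desktop over ssh, and I was confused about whether the key should be created on the Coral or on the computer I am connecting from.

To summarize: create the key on the computer you are connecting from (using ssh-keygen), then add the public key to ~/.ssh/authorized_keys on the Coral.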
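The "add the public key" step is just appending one line to a text file. Here is a minimal Python sketch of that step, run against a temporary directory instead of a real ~/.ssh. The function name authorize_key and the placeholder key string are made up for illustration; in practice a tool such as ssh-copy-id does this for you.

```python
import os
import tempfile
from pathlib import Path


def authorize_key(public_key_line, ssh_dir):
    """Append a public key to <ssh_dir>/authorized_keys unless it is already there.

    Hypothetical helper for illustration. Returns True if the key was added.
    """
    ssh_dir = Path(ssh_dir)
    ssh_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    auth_file = ssh_dir / "authorized_keys"
    text = auth_file.read_text() if auth_file.exists() else ""
    key = public_key_line.strip()
    if key in text.splitlines():
        return False  # already authorized, nothing to do
    with auth_file.open("a") as f:
        if text and not text.endswith("\n"):
            f.write("\n")  # keep one key per line
        f.write(key + "\n")
    os.chmod(auth_file, 0o600)  # sshd usually expects restrictive permissions
    return True


# Demo in a throwaway directory standing in for the Coral's home directory
with tempfile.TemporaryDirectory() as tmp:
    demo_key = "ssh-ed25519 AAAA...placeholder user@desktop"  # not a real key
    added_first = authorize_key(demo_key, Path(tmp) / ".ssh")
    added_again = authorize_key(demo_key, Path(tmp) / ".ssh")
    stored = (Path(tmp) / ".ssh" / "authorized_keys").read_text()

print(added_first, added_again)
```

The private key never leaves the desktop; only the public half is copied to the device.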

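Briefly, SSH works as follows.

Consider the situation where, as above, the public key is stored on the server and the client tries to connect to the server.

  1. The server generates a random value and stores its hash.
  2. The server encrypts the hash with the public key (the one stored in authorized_keys) and sends it to the client.
  3. The client decrypts the received value with its private key. A value encrypted with the public key can only be decrypted with the private key.
  4. The client sends the decrypted value back to the server, and the server completes authentication if it matches the original hash.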
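The four steps above can be sketched in a few lines of Python. This is only a toy: it uses a textbook-sized RSA key (n = 61 × 53), which is not secure, and all function names are my own. Real SSH implementations use much larger keys and a different message flow, so treat this purely as an illustration of the steps as written.

```python
import hashlib
import secrets

# Toy RSA key pair built from the primes 61 and 53 (NOT secure, illustration only).
# Public key: (N, E), stored on the server. Private key: (N, D), kept by the client.
N, E, D = 3233, 17, 2753


def server_make_challenge():
    """Steps 1-2: make a random value, keep its hash, send the hash encrypted."""
    random_value = secrets.token_bytes(16)
    digest = hashlib.sha256(random_value).digest()
    expected = int.from_bytes(digest, "big") % N  # reduced to fit the toy modulus
    encrypted = pow(expected, E, N)               # encrypt with the public key
    return expected, encrypted


def client_answer(encrypted):
    """Step 3: decrypt with the private key."""
    return pow(encrypted, D, N)


def server_verify(expected, answer):
    """Step 4: authenticate only if the answer matches the stored hash."""
    return expected == answer


expected, encrypted = server_make_challenge()
answer = client_answer(encrypted)
authenticated = server_verify(expected, answer)
rejected = server_verify(expected, (answer + 1) % N)  # a wrong answer fails
print(authenticated, rejected)  # True False
```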

