input image size: (h, w)
kernel size: k
stride: s
padding: p
output image size (the division rounds down when s does not divide evenly):
    oh = floor((h + 2 * p - k) / s) + 1
    ow = floor((w + 2 * p - k) / s) + 1
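
A minimal sketch of the formula above in Python, assuming a square k x k kernel, the same stride and padding in both dimensions, and no dilation. The function name conv_output_size and the default values for s and p are illustrative choices, not taken from the original note.

```python
def conv_output_size(h, w, k, s=1, p=0):
    """Return (oh, ow) for an (h, w) input, k x k kernel, stride s, padding p.

    Hypothetical helper: assumes symmetric padding and no dilation.
    """
    if s <= 0:
        raise ValueError("stride must be positive")
    if h + 2 * p < k or w + 2 * p < k:
        raise ValueError("kernel is larger than the padded input")
    # Integer (floor) division: positions where the kernel would
    # run past the padded edge are dropped.
    oh = (h + 2 * p - k) // s + 1
    ow = (w + 2 * p - k) // s + 1
    return oh, ow


print(conv_output_size(32, 32, k=3, s=1, p=1))    # (32, 32)
print(conv_output_size(224, 224, k=7, s=2, p=3))  # (112, 112)
```

With k = 3, s = 1, p = 1 the output keeps the input size, which is why that combination is a common "same size" setting. In the second call (224 + 6 - 7) / 2 is 111.5, so the floor matters: the result is 111 + 1 = 112.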