Missing values are pervasive in financial time series, and handling them appropriately is essential to the accuracy and reliability of financial models and forecasts. In this paper, we focus on datasets containing multiple attributes of different firms over time, such as firm fundamentals or characteristics, which can be represented as three-dimensional tensors with the dimensions time, firm, and attribute. The task of imputing missing values in these datasets can therefore be formulated as a tensor completion problem. Tensor completion has a wide range of applications, including link prediction, recommendation, and scientific data extrapolation. The widely used completion algorithms, CP and Tucker decompositions, factorize an N-order tensor into N embedding matrices (plus a core tensor in the Tucker case) and rely on multi-linearity among the factors to reconstruct the tensor. Real-world data, however, are often highly sparse and involve complex interactions beyond simple N-order linearity; they demand models capable of capturing latent variables and their non-linear multi-way interactions. We design an algorithm, called Non-Linear Matryoshka Tucker Completion (NMTucker), that uses element-wise Tucker decomposition, multi-layer perceptrons, and non-linear activation functions to address these challenges while remaining scalable. To avoid the overfitting common in existing neural network-based tensor algorithms, we develop a novel strategy that recursively decomposes the Tucker core into smaller cores, reducing the number of trainable parameters and regularizing model complexity. The structure resembles Matryoshka dolls of decreasing size, each nested inside the next. We conduct experiments showing that NMTucker effectively mitigates overfitting and generalizes better than state-of-the-art models (up to 53.91% lower RMSE) across multiple tensor completion tasks.
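The nested-core construction can be sketched in its purely multi-linear form as follows. This is a minimal illustration, not the paper's implementation: all dimensions, rank choices, and variable names are assumptions, and the sketch deliberately omits the multi-layer perceptrons and non-linear activations that NMTucker adds on top of this skeleton.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tensor of shape (time, firm, attribute) with illustrative ranks.
T, F, A = 12, 8, 5          # tensor dimensions (assumed for the example)
R1, R2, R3 = 4, 4, 3        # outer Tucker ranks
S1, S2, S3 = 2, 2, 2        # inner ranks for the nested ("Matryoshka") core

# Factor matrices of the outer Tucker decomposition.
U1 = rng.normal(size=(T, R1))
U2 = rng.normal(size=(F, R2))
U3 = rng.normal(size=(A, R3))

# Instead of a dense core G of shape (R1, R2, R3), represent G itself by a
# smaller Tucker decomposition -- the recursive, doll-inside-doll structure.
V1 = rng.normal(size=(R1, S1))
V2 = rng.normal(size=(R2, S2))
V3 = rng.normal(size=(R3, S3))
H = rng.normal(size=(S1, S2, S3))   # innermost core

def tucker_reconstruct(core, factors):
    """Multi-linear Tucker reconstruction: core x_1 U1 x_2 U2 x_3 U3."""
    u1, u2, u3 = factors
    return np.einsum('abc,ia,jb,kc->ijk', core, u1, u2, u3)

# Rebuild the outer core from the inner decomposition, then the full tensor.
G = tucker_reconstruct(H, (V1, V2, V3))
X_hat = tucker_reconstruct(G, (U1, U2, U3))

# The nested core uses fewer trainable parameters than a dense core would,
# which is the source of the regularization effect described above.
dense_core_params = R1 * R2 * R3
nested_core_params = H.size + V1.size + V2.size + V3.size
print(X_hat.shape, dense_core_params, nested_core_params)
```

With these illustrative ranks, the nested core needs 30 parameters where a dense core would need 48; missing entries of the original tensor would then be imputed by reading off the corresponding entries of the reconstruction.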